SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(iii) NUMBER OF SEQUENCES: 26

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1057 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 124..893

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..1057
(D) OTHER INFORMATION: /note= "product = Arabidopsis thaliana AP1."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

- - CTTTCCAATT GGTTCATACC AAAGTCTGAG CTCTTCTTTA TATCTCTCTT GT - #AGTTTCTT 60

- - ATTGGGGGTC TTTGTTTTGT TTGGTTCTTT TAGAGTAAGA AGTTTCTTAA AA - #AAGGATCA 120

- - AAA ATG GGA AGG GGT AGG GTT CAA TTG AAG AG - #G ATA GAG AAC AAG ATC

168

Met Gly Arg Gly Arg Val Gln Leu - #Lys Arg Ile Glu Asn Lys Ile
1 - # 5 - # 10 - # 15

- - AAT AGA CAA GTG ACA TTC TCG AAA AGA AGA GC - #T GGT CTT TTG AAG AAA 

216 

Asn Arg Gln Val Thr Phe Ser Lys Arg Arg Al - #a Gly Leu Leu Lys Lys
20 - # 25 - # 30 

- - GCT CAT GAG ATC TCT GTT CTC TGT GAT GCT GA - #A GTT GCT CTT GTT GTC 

264 

Ala His Glu Ile Ser Val Leu Cys Asp Ala Gl - #u Val Ala Leu Val Val
35 - # 40 - # 45

- - TTC TCC CAT AAG GGG AAA CTC TTC GAA TAC TC - #C ACT GAT TCT TGT ATG 

312 

Phe Ser His Lys Gly Lys Leu Phe Glu Tyr Se - #r Thr Asp Ser Cys Met 
50 - # 55 - # 60 

- - GAG AAG ATA CTT GAA CGC TAT GAG AGG TAC TC - #T TAC GCC GAA AGA CAG 

360 

Glu Lys Ile Leu Glu Arg Tyr Glu Arg Tyr Se - #r Tyr Ala Glu Arg Gln
65 - # 70 - # 75 

- - CTT ATT GCA CCT GAG TCC GAC GTC AAT ACA AA - #C TGG TCG ATG GAG TAT 

408 

Leu Ile Ala Pro Glu Ser Asp Val Asn Thr As - #n Trp Ser Met Glu Tyr
80 - # 85 - # 90 - # 95 

- - AAC AGG CTT AAG GCT AAG ATT GAG CTT TTG GA - #G AGA AAC CAG AGG CAT 

456 

Asn Arg Leu Lys Ala Lys Ile Glu Leu Leu Gl - #u Arg Asn Gln Arg His
100 - # 105 - # 110 

- - TAT CTT GGG GAA GAC TTG CAA GCA ATG AGC CC - #T AAA GAG CTT CAG AAT 

504 

Tyr Leu Gly Glu Asp Leu Gln Ala Met Ser Pr - #o Lys Glu Leu Gln Asn
115 - # 120 - # 125

- - CTG GAG CAG CAG CTT GAC ACT GCT CTT AAG CA - #C ATC CGC ACT AGA AAA

552 

Leu Glu Gln Gln Leu Asp Thr Ala Leu Lys Hi - #s Ile Arg Thr Arg Lys
130 - # 135 - # 140 

- - AAC CAA CTT ATG TAC GAG TCC ATC AAT GAG CT - #C CAA AAA AAG GAG AAG 

600 

Asn Gln Leu Met Tyr Glu Ser Ile Asn Glu Le - #u Gln Lys Lys Glu Lys
145 - # 150 - # 155 

- - GCC ATA CAG GAG CAA AAC AGC ATG CTT TCT AA - #A CAG ATC AAG GAG AGG 

648 

Ala Ile Gln Glu Gln Asn Ser Met Leu Ser Ly - #s Gln Ile Lys Glu Arg
160 - # 165 - # 170 - # 175

- - GAA AAA ATT CTT AGG GCT CAA CAG GAG CAG TG - #G GAT CAG CAG AAC CAA

696

Glu Lys Ile Leu Arg Ala Gln Gln Glu Gln Tr - #p Asp Gln Gln Asn Gln
180 - # 185 - # 190 

- - GGC CAC AAT ATG CCT CCC CCT CTG CCA CCG CA - #G CAG CAC CAA ATC CAG 

744 

Gly His Asn Met Pro Pro Pro Leu Pro Pro Gl - #n Gln His Gln Ile Gln
195 - # 200 - # 205 

- - CAT CCT TAC ATG CTC TCT CAT CAG CCA TCT CC - #T TTT CTC AAC ATG GGT 

792 

His Pro Tyr Met Leu Ser His Gln Pro Ser Pr - #o Phe Leu Asn Met Gly
210 - # 215 - # 220 

- - GGT CTG TAT CAA GAA GAT GAT CCA ATG GCA AT - #G AGG AGG AAT GAT CTC 

840 

Gly Leu Tyr Gln Glu Asp Asp Pro Met Ala Me - #t Arg Arg Asn Asp Leu
225 - # 230 - # 235 

- - GAA CTG ACT CTT GAA CCC GTT TAC AAC TGC AA - #C CTT GGC TGC TTC GCC 

888 

Glu Leu Thr Leu Glu Pro Val Tyr Asn Cys As - #n Leu Gly Cys Phe Ala 
240 - # 245 - # 250 - # 255

- - GCA TG AAGCATTTCC ATATATATAT TTGTAATCGT CAACAATAAA AAC - #AGTTTGC

943

Ala

- - CACATACATA TAAATAGTGG CTAGGCTCTT TTCATCCAAT TAATATATTT TG - #GCAAATGT 1003

- - TCGATGTTCT TATATCATCA TATATAAATT AGCAGGCTCC TTTCTTTTTT TG - #TA 1057

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 256 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

- - Met Gly Arg Gly Arg Val Gln Leu Lys Arg Il - #e Glu Asn Lys Ile Asn
1 5 - # 10 - # 15

- - Arg Gln Val Thr Phe Ser Lys Arg Arg Ala Gl - #y Leu Leu Lys Lys Ala
20 - # 25 - # 30

- - His Glu Ile Ser Val Leu Cys Asp Ala Glu Va - #l Ala Leu Val Val Phe
35 - # 40 - # 45

- - Ser His Lys Gly Lys Leu Phe Glu Tyr Ser Th - #r Asp Ser Cys Met Glu
50 - # 55 - # 60

- - Lys Ile Leu Glu Arg Tyr Glu Arg Tyr Ser Ty - #r Ala Glu Arg Gln Leu
65 - # 70 - # 75 - # 80

- - Ile Ala Pro Glu Ser Asp Val Asn Thr Asn Tr - #p Ser Met Glu Tyr Asn
85 - # 90 - # 95

- - Arg Leu Lys Ala Lys Ile Glu Leu Leu Glu Ar - #g Asn Gln Arg His Tyr
100 - # 105 - # 110

- - Leu Gly Glu Asp Leu Gln Ala Met Ser Pro Ly - #s Glu Leu Gln Asn Leu
115 - # 120 - # 125

- - Glu Gln Gln Leu Asp Thr Ala Leu Lys His Il - #e Arg Thr Arg Lys Asn
130 - # 135 - # 140

- - Gln Leu Met Tyr Glu Ser Ile Asn Glu Leu Gl - #n Lys Lys Glu Lys Ala
145 - # 150 - # 155 - # 160

- - Ile Gln Glu Gln Asn Ser Met Leu Ser Lys Gl - #n Ile Lys Glu Arg Glu
165 - # 170 - # 175

- - Lys Ile Leu Arg Ala Gln Gln Glu Gln Trp As - #p Gln Gln Asn Gln Gly
180 - # 185 - # 190

- - His Asn Met Pro Pro Pro Leu Pro Pro Gln Gl - #n His Gln Ile Gln His
195 - # 200 - # 205

- - Pro Tyr Met Leu Ser His Gln Pro Ser Pro Ph - #e Leu Asn Met Gly Gly
210 - # 215 - # 220

- - Leu Tyr Gln Glu Asp Asp Pro Met Ala Met Ar - #g Arg Asn Asp Leu Glu
225 - # 230 - # 235 - # 240

- - Leu Thr Leu Glu Pro Val Tyr Asn Cys Asn Le - #u Gly Cys Phe Ala Ala
245 - # 250 - # 255



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 794 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 36..794

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..794
(D) OTHER INFORMATION: /note= "product = Brassica oleracea AP1."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

- - TCTTAGAGGA AATAGTTCCT TTAAAAGGGA TAAAA ATG GGA AGG - #GGT AGG GTT 

53 

- # - # Met Gly Arg Gly Arg Val 

- # - # 1 - # 5 

- - CAG TTG AAG AGG ATA GAA AAC AAG ATC AAT AG - #A CAA GTG ACA TTC TCG 

101 

Gln Leu Lys Arg Ile Glu Asn Lys Ile Asn Ar - #g Gln Val Thr Phe Ser
10 - # 15 - # 20 

- - AAA AGA AGA GCT GGT CTT ATG AAG AAA GCT CA - #T GAG ATC TCT GTT CTG 

149 

Lys Arg Arg Ala Gly Leu Met Lys Lys Ala Hi - #s Glu Ile Ser Val Leu
25 - # 30 - # 35 

- - TGT GAT GCT GAA GTT GCG CTT GTT GTC TTC TC - #C CAT AAG GGG AAA CTC

197

Cys Asp Ala Glu Val Ala Leu Val Val Phe Se - #r His Lys Gly Lys Leu 
40 - # 45 - # 50 

- - TTT GAA TAC TCC ACT GAT TCT TGT ATG GAG AA - #G ATA CTT GAA CGC TAT 

245 

Phe Glu Tyr Ser Thr Asp Ser Cys Met Glu Ly - #s Ile Leu Glu Arg Tyr
55 - # 60 - # 65 - # 70 

- - GAG AGA TAC TCT TAC GCC GAG AGA CAG CTT AT - #A GCA CCT GAG TCC GAC 

293 

Glu Arg Tyr Ser Tyr Ala Glu Arg Gln Leu Il - #e Ala Pro Glu Ser Asp
75 - # 80 - # 85 

- - TCC AAT ACG AAC TGG TCG ATG GAG TAT AAT AG - #G CTT AAG GCT AAG ATT 

341 

Ser Asn Thr Asn Trp Ser Met Glu Tyr Asn Ar - #g Leu Lys Ala Lys Ile
90 - # 95 - # 100 

- - GAG CTT TTG GAG AGA AAC CAG AGG CAC TAT CT - #T GGG GAA GAC TTG CAA 

389 

Glu Leu Leu Glu Arg Asn Gln Arg His Tyr Le - #u Gly Glu Asp Leu Gln
105 - # 110 - # 115 

- - GCA ATG AGC CCT AAG GAA CTC CAG AAT CTA GA - #G CAA CAG CTT GAT ACT 

437 

Ala Met Ser Pro Lys Glu Leu Gln Asn Leu Gl - #u Gln Gln Leu Asp Thr
120 - # 125 - # 130 

- - GCT CTT AAG CAC ATC CGC TCT AGA AAA AAC CA - #A CTT ATG TAC GAC TCC 

485 

Ala Leu Lys His Ile Arg Ser Arg Lys Asn Gl - #n Leu Met Tyr Asp Ser
135 - # 140 - # 145 - # 150

- - ATC AAT GAG CTC CAA AGA AAG GAG AAA GCC AT - #A CAG GAA CAA AAC AGC

533

Ile Asn Glu Leu Gln Arg Lys Glu Lys Ala Il - #e Gln Glu Gln Asn Ser
155 - # 160 - # 165 

- - ATG CTT TCC AAG CAG ATT AAG GAG AGG GAA AA - #C GTT CTT AGG GCG CAA 

581 

Met Leu Ser Lys Gln Ile Lys Glu Arg Glu As - #n Val Leu Arg Ala Gln
170 - # 175 - # 180 

- - CAA GAG CAA TGG GAC GAG CAG AAC CAT GGC CA - #T AAT ATG CCT CCG CCT 

629 

Gln Glu Gln Trp Asp Glu Gln Asn His Gly Hi - #s Asn Met Pro Pro Pro
185 - # 190 - # 195 

- - CCA CCC CCG CAG CAG CAT CAA ATC CAG CAT CC - #T TAC ATG CTC TCT CAT 

677 

Pro Pro Pro Gln Gln His Gln Ile Gln His Pr - #o Tyr Met Leu Ser His
200 - # 205 - # 210 

- - CAG CCA TCT CCT TTT CTC AAC ATG GGG GGG CT - #G TAT CAA GAA GAA GAT 

725 

Gln Pro Ser Pro Phe Leu Asn Met Gly Gly Le - #u Tyr Gln Glu Glu Asp
215 - # 220 - # 225 - # 230

- - CAA ATG GCA ATG AGG AGG AAC GAT CTC GAT CT - #G TCT CTT GAA CCC GGT

773

Gln Met Ala Met Arg Arg Asn Asp Leu Asp Le - #u Ser Leu Glu Pro Gly
235 - # 240 - # 245 

- - TAT AAC TGC AAT CTC GGC TGC - # - #

794

Tyr Asn Cys Asn Leu Gly Cys
250

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 253 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

- - Met Gly Arg Gly Arg Val Gln Leu Lys Arg Il - #e Glu Asn Lys Ile Asn
1 5 - # 10 - # 15

- - Arg Gln Val Thr Phe Ser Lys Arg Arg Ala Gl - #y Leu Met Lys Lys Ala
20 - # 25 - # 30

- - His Glu Ile Ser Val Leu Cys Asp Ala Glu Va - #l Ala Leu Val Val Phe
35 - # 40 - # 45

- - Ser His Lys Gly Lys Leu Phe Glu Tyr Ser Th - #r Asp Ser Cys Met Glu
50 - # 55 - # 60

- - Lys Ile Leu Glu Arg Tyr Glu Arg Tyr Ser Ty - #r Ala Glu Arg Gln Leu
65 - # 70 - # 75 - # 80

- - Ile Ala Pro Glu Ser Asp Ser Asn Thr Asn Tr - #p Ser Met Glu Tyr Asn
85 - # 90 - # 95

- - Arg Leu Lys Ala Lys Ile Glu Leu Leu Glu Ar - #g Asn Gln Arg His Tyr
100 - # 105 - # 110

- - Leu Gly Glu Asp Leu Gln Ala Met Ser Pro Ly - #s Glu Leu Gln Asn Leu
115 - # 120 - # 125

- - Glu Gln Gln Leu Asp Thr Ala Leu Lys His Il - #e Arg Ser Arg Lys Asn
130 - # 135 - # 140

- - Gln Leu Met Tyr Asp Ser Ile Asn Glu Leu Gl - #n Arg Lys Glu Lys Ala
145 - # 150 - # 155 - # 160

- - Ile Gln Glu Gln Asn Ser Met Leu Ser Lys Gl - #n Ile Lys Glu Arg Glu
165 - # 170 - # 175

- - Asn Val Leu Arg Ala Gln Gln Glu Gln Trp As - #p Glu Gln Asn His Gly
180 - # 185 - # 190

- - His Asn Met Pro Pro Pro Pro Pro Pro Gln Gl - #n His Gln Ile Gln His
195 - # 200 - # 205

- - Pro Tyr Met Leu Ser His Gln Pro Ser Pro Ph - #e Leu Asn Met Gly Gly
210 - # 215 - # 220

- - Leu Tyr Gln Glu Glu Asp Gln Met Ala Met Ar - #g Arg Asn Asp Leu Asp
225 - # 230 - # 235 - # 240

- - Leu Ser Leu Glu Pro Gly Tyr Asn Cys Asn Le - #u Gly Cys
245 - # 250
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 768 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..766

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..768
(D) OTHER INFORMATION: /note= "product = Brassica oleracea var. botrytis AP1."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

- - ATG GGA AGG GGT AGG GTT GAG TTG AAG AGG AT - #A GAA AAC AAG ATC AAT 

48 

Met Gly Arg Gly Arg Val Gln Leu Lys Arg Il - #e Glu Asn Lys Ile Asn
1 5 - # 10 - # 15 

- - AGA CAA GTG ACA TTC TCG AAA AGA AGA GCT GG - #T CTT ATG AAG AAA GCT 

96 

Arg Gln Val Thr Phe Ser Lys Arg Arg Ala Gl - #y Leu Met Lys Lys Ala
20 - # 25 - # 30 

- - CAT GAG ATC TCT GTT CTG TGT GAT GCT GAA GT - #T GCG CTT GTT GTC TTC

144 

His Glu Ile Ser Val Leu Cys Asp Ala Glu Va - #l Ala Leu Val Val Phe
35 - # 40 - # 45 

- - TCC CAT AAG GGG AAA CTC TTT GAA TAC CCC AC - #T GAT TCT TGT ATG GAG 

192 

Ser His Lys Gly Lys Leu Phe Glu Tyr Pro Th - #r Asp Ser Cys Met Glu 
50 - # 55 - # 60 

- - GAG ATA CTT GAA CGC TAT GAG AGA TAC TCT TA - #C GCC GAG AGA CAG CTT 

240 

Glu Ile Leu Glu Arg Tyr Glu Arg Tyr Ser Ty - #r Ala Glu Arg Gln Leu
65 - # 70 - # 75 - # 80 

- - ATA GCA CCT GAG TCC GAC TCC AAT ACG AAC TG - #G TCG ATG GAG TAT AAT 

288 

Ile Ala Pro Glu Ser Asp Ser Asn Thr Asn Tr - #p Ser Met Glu Tyr Asn
85 - # 90 - # 95 

- - AGG CTT AAG GCT AAG ATT GAG CTT TTG GAG AG - #A AAC CAG AGG CAC TAT 

336 

Arg Leu Lys Ala Lys Ile Glu Leu Leu Glu Ar - #g Asn Gln Arg His Tyr
100 - # 105 - # 110 

- - CTT GGG GAA GAC TTG CAA GCA ATG AGC CCT AA - #G GAA CTC CAG AAT CTA 

384 

Leu Gly Glu Asp Leu Gln Ala Met Ser Pro Ly - #s Glu Leu Gln Asn Leu
115 - # 120 - # 125 

- - GAG CAA CAG CTT GAT ACT GCT CTT AAG CAC AT - #C CGC TCT AGA AAA AAC 

432 

Glu Gln Gln Leu Asp Thr Ala Leu Lys His Il - #e Arg Ser Arg Lys Asn
130 - # 135 - # 140 

- - CAA CTT ATG TAC GAC TCC ATC AAT GAG CTC CA - #A AGA AAG GAG AAA GCC 

480 

Gln Leu Met Tyr Asp Ser Ile Asn Glu Leu Gl - #n Arg Lys Glu Lys Ala
145 - # 150 - # 155 - # 160

- - ATA CAG GAA CAA AAC AGC ATG CTT TCC AAG CA - #G ATT AAG GAG AGG GAA

528

Ile Gln Glu Gln Asn Ser Met Leu Ser Lys Gl - #n Ile Lys Glu Arg Glu
165 - # 170 - # 175 

- - AAC GTT CTT AGG GCG CAA CAA GAG CAA TGG GA - #C GAG CAG AAC CAT GGC

576

Asn Val Leu Arg Ala Gln Gln Glu Gln Trp As - #p Glu Gln Asn His Gly
180 - # 185 - # 190 

- - CAT AAT ATG CCT CCG CCT CCA CCC CCG CAG CA - #G CAT CAA ATC CAG CAT 

624 

His Asn Met Pro Pro Pro Pro Pro Pro Gln Gl - #n His Gln Ile Gln His
195 - # 200 - # 205 

- - CCT TAC ATG CTC TCT CAT CAG CCA TCT CCT TT - #T CTC AAC ATG GGA GGG 

672 

Pro Tyr Met Leu Ser His Gln Pro Ser Pro Ph - #e Leu Asn Met Gly Gly
210 - # 215 - # 220 

- - CTG TAT CAA GAA GAA GAT CAA ATG GCA ATG AG - #G AGG AAC GAT CTC GAT 

720 

Leu Tyr Gln Glu Glu Asp Gln Met Ala Met Ar - #g Arg Asn Asp Leu Asp
225 - # 230 - # 235 - # 240

- - CTG TCT CTT GAA CCC GTT TAC AAC TGC AAC CT - #T GGC CGT CGC TGC T 

766 

Leu Ser Leu Glu Pro Val Tyr Asn Cys Asn Le - #u Gly Arg Arg Cys 

245 - # 250 - # 255 

- - GA - # - # - #

768

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

- - Met Gly Arg Gly Arg Val Gln Leu Lys Arg Il - #e Glu Asn Lys Ile Asn
1 5 - # 10 - # 15

- - Arg Gln Val Thr Phe Ser Lys Arg Arg Ala Gl - #y Leu Met Lys Lys Ala
20 - # 25 - # 30

- - His Glu Ile Ser Val Leu Cys Asp Ala Glu Va - #l Ala Leu Val Val Phe
35 - # 40 - # 45

- - Ser His Lys Gly Lys Leu Phe Glu Tyr Pro Th - #r Asp Ser Cys Met Glu
50 - # 55 - # 60

- - Glu Ile Leu Glu Arg Tyr Glu Arg Tyr Ser Ty - #r Ala Glu Arg Gln Leu
65 - # 70 - # 75 - # 80

- - Ile Ala Pro Glu Ser Asp Ser Asn Thr Asn Tr - #p Ser Met Glu Tyr Asn
85 - # 90 - # 95

- - Arg Leu Lys Ala Lys Ile Glu Leu Leu Glu Ar - #g Asn Gln Arg His Tyr
100 - # 105 - # 110

- - Leu Gly Glu Asp Leu Gln Ala Met Ser Pro Ly - #s Glu Leu Gln Asn Leu
115 - # 120 - # 125

- - Glu Gln Gln Leu Asp Thr Ala Leu Lys His Il - #e Arg Ser Arg Lys Asn
130 - # 135 - # 140

- - Gln Leu Met Tyr Asp Ser Ile Asn Glu Leu Gl - #n Arg Lys Glu Lys Ala
145 - # 150 - # 155 - # 160

- - Ile Gln Glu Gln Asn Ser Met Leu Ser Lys Gl - #n Ile Lys Glu Arg Glu
165 - # 170 - # 175

- - Asn Val Leu Arg Ala Gln Gln Glu Gln Trp As - #p Glu Gln Asn His Gly
180 - # 185 - # 190

- - His Asn Met Pro Pro Pro Pro Pro Pro Gln Gl - #n His Gln Ile Gln His
195 - # 200 - # 205

- - Pro Tyr Met Leu Ser His Gln Pro Ser Pro Ph - #e Leu Asn Met Gly Gly
210 - # 215 - # 220

- - Leu Tyr Gln Glu Glu Asp Gln Met Ala Met Ar - #g Arg Asn Asp Leu Asp
225 - # 230 - # 235 - # 240

- - Leu Ser Leu Glu Pro Val Tyr Asn Cys Asn Le - #u Gly Arg Arg Cys
245 - # 250 - # 255



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1345 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 149..968

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..1345
(D) OTHER INFORMATION: /note= "product = Zea mays AP1."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



- - GCACGAGTCC TCCTCCTCCT CGCATCCCAC CCCACCCCAC CTTCTCCTTA AA - #GCTACCTG 60

- - CCTACCCGGC GGTTGCGCGC CGCAATCGAT CGACCGGAAG AGAAAGAGCA GC - #TAGCTAGC 120

- - TAGCAGATCG GAGCACGGCA ACAAGGCG ATG GGG CGC GGC AAG - #GTA CAG CTG

172

- # Met Gly Arg - #Gly Lys Val Gln Leu

- # 1 - # 5 

- - AAG CGG ATA GAG AAC AAG ATA AAC CGG CAG GT - #G ACC TTC TCC AAG CGC 

220 

Lys Arg Ile Glu Asn Lys Ile Asn Arg Gln Va - #l Thr Phe Ser Lys Arg
10 - # 15 - # 20 

- - CGG AAC GGC CTG CTC AAG AAG GCG CAC GAG AT - #C TCC GTC CTC TGC GAT 

268 

Arg Asn Gly Leu Leu Lys Lys Ala His Glu Il - #e Ser Val Leu Cys Asp
25 - # 30 - # 35 - # 40

- - GCC GAG GTC GCC GTC ATC GTC TTC TCC CCC AA - #G GGC AAG CTC TAC GAG 

316 

Ala Glu Val Ala Val Ile Val Phe Ser Pro Ly - #s Gly Lys Leu Tyr Glu
45 - # 50 - # 55 

- - TAC GCC ACC GAC TCC CGC ATG GAC AAA ATT CT - #T GAA CGC TAT GAG CGA 

364 

Tyr Ala Thr Asp Ser Arg Met Asp Lys Ile Le - #u Glu Arg Tyr Glu Arg
60 - # 65 - # 70 

- - TAT TCC TAT GCT GAA AAG GCT CTT ATT TCA GC - #T GAA TCT GAA AGT GAG 

412 

Tyr Ser Tyr Ala Glu Lys Ala Leu Ile Ser Al - #a Glu Ser Glu Ser Glu
75 - # 80 - # 85

- - GGA AAT TGG TGC CAC GAA TAC AGG AAA CTG AA - #G GCC AAA ATT GAG ACC 

460 

Gly Asn Trp Cys His Glu Tyr Arg Lys Leu Ly - #s Ala Lys Ile Glu Thr
90 - # 95 - # 100

- - ATA CAA AAA TGC CAC AAG CAC CTG ATG GGA GA - #G GAT CTA GAG TCT TTG 

508 

Ile Gln Lys Cys His Lys His Leu Met Gly Gl - #u Asp Leu Glu Ser Leu
105 - # 110 - # 115 - # 120

- - AAT CCC AAA GAG CTC CAG CAA CTA GAG CAG CA - #G CTG GAT AGC TCA CTG

556

Asn Pro Lys Glu Leu Gln Gln Leu Glu Gln Gl - #n Leu Asp Ser Ser Leu
125 - # 130 - # 135 

- - AAG CAC ATC AGA TCA AGG AAG AGC CAC CTT AT - #G GCC GAG TCT ATT TCT 

604 

Lys His Ile Arg Ser Arg Lys Ser His Leu Me - #t Ala Glu Ser Ile Ser
140 - # 145 - # 150 

- - GAG CTA CAG AAG AAG GAG AGG TCA CTG CAG GA - #G GAG AAC AAG GCT CTG 

652 

Glu Leu Gln Lys Lys Glu Arg Ser Leu Gln Gl - #u Glu Asn Lys Ala Leu
155 - # 160 - # 165 

- - CAG AAG GAA CTT GCG GAG AGG CAG AAG GCC GT - #C GCG AGC CGG CAG CAG 

700 

Gln Lys Glu Leu Ala Glu Arg Gln Lys Ala Va - #l Ala Ser Arg Gln Gln
170 - # 175 - # 180 

- - CAG CAA CAG CAG CAG GTG CAG TGG GAC CAG CA - #G ACA CAT GCC CAG GCC 

748 

Gln Gln Gln Gln Gln Val Gln Trp Asp Gln Gl - #n Thr His Ala Gln Ala
185 - # 190 - # 195 - # 200

- - CAG ACA AGC TCA TCA TCG TCC TCC TTC ATG AT - #G AGG CAG GAT CAG CAG

796

Gln Thr Ser Ser Ser Ser Ser Ser Phe Met Me - #t Arg Gln Asp Gln Gln
205 - # 210 - # 215

- - GGA CTG CCG CCT CCA CAC AAC ATC TGC TTC CC - #G CCG TTG ACA ATG GGA 

844 

Gly Leu Pro Pro Pro His Asn Ile Cys Phe Pr - #o Pro Leu Thr Met Gly
220 - # 225 - # 230 

- - GAT AGA GGT GAA GAG CTG GCT GCG GCG GCG GC - #G GCG CAG CAG CAG CAG 

892 

Asp Arg Gly Glu Glu Leu Ala Ala Ala Ala Al - #a Ala Gln Gln Gln Gln
235 - # 240 - # 245 

- - CCA CTG CCG GGG CAG GCG CAA CCG CAG CTC CG - #C ATC GCA GGT CTG CCA 

940 

Pro Leu Pro Gly Gln Ala Gln Pro Gln Leu Ar - #g Ile Ala Gly Leu Pro
250 - # 255 - # 260 

- - CCA TGG ATG CTG AGC CAC CTC AAT GCA T AAGG - #AGAGGG TCGATGAACA 

988

Pro Trp Met Leu Ser His Leu Asn Ala
265 - # 270

- - CATCGACCTC CTCTCTCTCT CTCTCTCGTC ATGGATCATG ACGTACGCGT AC - #CATATGGT 1048

- - TGCTGTGCCT GCCCCCATCG ATCGCGAGCA ATGGCACGCT CATGCAAGTG AT - #CATTGCTC 1108

- - CCCGTTGGTT AAACCCTAGC CTATGTTCAT GGCGTCAGCA ACTAAGCTAA AC - #TATTGTTA 1168

- - TGTTTGCAAG AAAGGGTAAA CCCGCTAGCT GTGTAATCTT GTCCAGCTAT CA - #GTATGCTT 1228

- - GTTACTGCCC AGTTACCCTT GAATCTAGCG GCGCTTTTGG TGAGAGGGTG CA - #GTTTACTT 1288

- - TAAACATGGT TCGTGACTTG CTGTAAATAG TAGTATTAAT CGATTTGGGC AT - #CTAAA 1345

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 273 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

- - Met Gly Arg Gly Lys Val Gln Leu Lys Arg Il - #e Glu Asn Lys Ile Asn
1 5 - # 10 - # 15

- - Arg Gln Val Thr Phe Ser Lys Arg Arg Asn Gl - #y Leu Leu Lys Lys Ala
20 - # 25 - # 30

- - His Glu Ile Ser Val Leu Cys Asp Ala Glu Va - #l Ala Val Ile Val Phe
35 - # 40 - # 45

- - Ser Pro Lys Gly Lys Leu Tyr Glu Tyr Ala Th - #r Asp Ser Arg Met Asp
50 - # 55 - # 60

- - Lys Ile Leu Glu Arg Tyr Glu Arg Tyr Ser Ty - #r Ala Glu Lys Ala Leu
65 - # 70 - # 75 - # 80

- - Ile Ser Ala Glu Ser Glu Ser Glu Gly Asn Tr - #p Cys His Glu Tyr Arg
85 - # 90 - # 95

- - Lys Leu Lys Ala Lys Ile Glu Thr Ile Gln Ly - #s Cys His Lys His Leu
100 - # 105 - # 110

- - Met Gly Glu Asp Leu Glu Ser Leu Asn Pro Ly - #s Glu Leu Gln Gln Leu
115 - # 120 - # 125

- - Glu Gln Gln Leu Asp Ser Ser Leu Lys His Il - #e Arg Ser Arg Lys Ser
130 - # 135 - # 140

- - His Leu Met Ala Glu Ser Ile Ser Glu Leu Gl - #n Lys Lys Glu Arg Ser
145 - # 150 - # 155 - # 160

- - Leu Gln Glu Glu Asn Lys Ala Leu Gln Lys Gl - #u Leu Ala Glu Arg Gln
165 - # 170 - # 175

- - Lys Ala Val Ala Ser Arg Gln Gln Gln Gln Gl - #n Gln Gln Val Gln Trp
180 - # 185 - # 190

- - Asp Gln Gln Thr His Ala Gln Ala Gln Thr Se - #r Ser Ser Ser Ser Ser
195 - # 200 - # 205

- - Phe Met Met Arg Gln Asp Gln Gln Gly Leu Pr - #o Pro Pro His Asn Ile
210 - # 215 - # 220

- - Cys Phe Pro Pro Leu Thr Met Gly Asp Arg Gl - #y Glu Glu Leu Ala Ala
225 - # 230 - # 235 - # 240

- - Ala Ala Ala Ala Gln Gln Gln Gln Pro Leu Pr - #o Gly Gln Ala Gln Pro
245 - # 250 - # 255

- - Gln Leu Arg Ile Ala Gly Leu Pro Pro Trp Me - #t Leu Ser His Leu Asn
260 - # 265 - # 270

- - Ala

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 779 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 10..775

(ix) FEATURE:
(A) NAME/KEY: unsure
(B) LOCATION: 778..779
(D) OTHER INFORMATION: /note= "N = one or more nucleotides."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..779
(D) OTHER INFORMATION: /note= "product = Arabidopsis thaliana CAL."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

- - TTAAGAGAA ATG GGA AGG GGT AGG GTT GAA TTG AAG - # AGG ATA GAG AAC 

48 

Met Gly Arg Gly Arg - #Val Glu Leu Lys Arg Ile Glu Asn
1 - # 5 - # 10 

- - AAG ATC AAT AGA CAA GTG ACA TTC TCG AAA AG - #A AGA ACT GGT CTT TTG 

96 

Lys Ile Asn Arg Gln Val Thr Phe Ser Lys Ar - #g Arg Thr Gly Leu Leu
15 - # 20 - # 25 

- - AAG AAA GCT CAG GAG ATC TCT GTT CTT TGT GA - #T GCC GAG GTT TCC CTT 

144 

Lys Lys Ala Gln Glu Ile Ser Val Leu Cys As - #p Ala Glu Val Ser Leu
30 - # 35 - # 40 - # 45 

- - ATT GTC TTC TCC CAT AAG GGC AAA TTG TTC GA - #G TAC TCC TCT GAA TCT 

192 

Ile Val Phe Ser His Lys Gly Lys Leu Phe Gl - #u Tyr Ser Ser Glu Ser
50 - # 55 - # 60 

- - TGC ATG GAG AAG GTA CTA GAA CGC TAC GAG AG - #G TAT TCT TAC GCC GAG 

240 

Cys Met Glu Lys Val Leu Glu Arg Tyr Glu Ar - #g Tyr Ser Tyr Ala Glu 
65 - # 70 - # 75 

- - AGA CAG CTG ATT GCA CCT GAC TCT CAC GTT AA - #T GCA CAG ACG AAC TGG 

288 

Arg Gln Leu Ile Ala Pro Asp Ser His Val As - #n Ala Gln Thr Asn Trp
80 - # 85 - # 90 

- - TCA ATG GAG TAT AGC AGG CTT AAG GCC AAG AT - #T GAG CTT TTG GAG AGA 

336 

Ser Met Glu Tyr Ser Arg Leu Lys Ala Lys Il - #e Glu Leu Leu Glu Arg
95 - # 100 - # 105 

- - AAC CAA AGG CAT TAT CTG GGA GAA GAG TTG GA - #A CCA ATG AGC CTC AAG 

384 

Asn Gln Arg His Tyr Leu Gly Glu Glu Leu Gl - #u Pro Met Ser Leu Lys
110 - # 115 - # 120 - # 125

- - GAT CTC CAA AAT CTG GAG CAG CAG CTT GAG AC - #T GCT CTT AAG CAC ATT

432

Asp Leu Gln Asn Leu Glu Gln Gln Leu Glu Th - #r Ala Leu Lys His Ile
130 - # 135 - # 140 

- - CGC TCC AGA AAA AAT CAA CTC ATG AAT GAG TC - #C CTC AAC CAC CTC CAA 

480

Arg Ser Arg Lys Asn Gln Leu Met Asn Glu Se - #r Leu Asn His Leu Gln
145 - # 150 - # 155 

- - AGA AAG GAG AAG GAG ATA CAG GAG GAA AAC AG - #C ATG CTT ACC AAA CAG 

528 

Arg Lys Glu Lys Glu Ile Gln Glu Glu Asn Se - #r Met Leu Thr Lys Gln
160 - # 165 - # 170

- - ATA AAG GAG AGG GAA AAC ATC CTA AAG ACA AA - #A CAA ACC CAA TGT GAG 

576 

Ile Lys Glu Arg Glu Asn Ile Leu Lys Thr Ly - #s Gln Thr Gln Cys Glu
175 - # 180 - # 185 

- - CAG CTG AAC CGC AGC GTC GAC GAT GTA CCA CA - #G CCA CAA CCA TTT CAA 

624 

Gln Leu Asn Arg Ser Val Asp Asp Val Pro Gl - #n Pro Gln Pro Phe Gln
190 - # 195 - # 200 - # 205

- - CAC CCC CAT CTT TAC ATG ATC GCT CAT CAG AC - #T TCT CCT TTC CTA AAT

672

His Pro His Leu Tyr Met Ile Ala His Gln Th - #r Ser Pro Phe Leu Asn
210 - # 215 - # 220 

- - ATG GGT GGT TTG TAC CAA GGA GAA GAC CAA AC - #G GCG ATG AGG AGG AAC 

720 

Met Gly Gly Leu Tyr Gln Gly Glu Asp Gln Th - #r Ala Met Arg Arg Asn
225 - # 230 - # 235 

- - AAT CTG GAT CTG ACT CTT GAA CCC ATT TAC AA - #T TAC CTT GGC TGT TAC 

768 

Asn Leu Asp Leu Thr Leu Glu Pro Ile Tyr As - #n Tyr Leu Gly Cys Tyr
240 - # 245 - # 250 

- - GCC GCT T GANN - # - # - #

779

Ala Ala
255

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

- - Met Gly Arg Gly Arg Val Glu Leu Lys Arg Il - #e Glu Asn Lys Ile Asn
1 5 - # 10 - # 15

- - Arg Gln Val Thr Phe Ser Lys Arg Arg Thr Gl - #y Leu Leu Lys Lys Ala
20 - # 25 - # 30

- - Gln Glu Ile Ser Val Leu Cys Asp Ala Glu Va - #l Ser Leu Ile Val Phe
35 - # 40 - # 45

- - Ser His Lys Gly Lys Leu Phe Glu Tyr Ser Se - #r Glu Ser Cys Met Glu
50 - # 55 - # 60

- - Lys Val Leu Glu Arg Tyr Glu Arg Tyr Ser Ty - #r Ala Glu Arg Gln Leu
65 - # 70 - # 75 - # 80

- - Ile Ala Pro Asp Ser His Val Asn Ala Gln Th - #r Asn Trp Ser Met Glu
85 - # 90 - # 95

- - Tyr Ser Arg Leu Lys Ala Lys Ile Glu Leu Le - #u Glu Arg Asn Gln Arg
100 - # 105 - # 110

- - His Tyr Leu Gly Glu Glu Leu Glu Pro Met Se - #r Leu Lys Asp Leu Gln
115 - # 120 - # 125

- - Asn Leu Glu Gln Gln Leu Glu Thr Ala Leu Ly - #s His Ile Arg Ser Arg
130 - # 135 - # 140

- - Lys Asn Gln Leu Met Asn Glu Ser Leu Asn Hi - #s Leu Gln Arg Lys Glu
145 - # 150 - # 155 - # 160

- - Lys Glu Ile Gln Glu Glu Asn Ser Met Leu Th - #r Lys Gln Ile Lys Glu
165 - # 170 - # 175

- - Arg Glu Asn Ile Leu Lys Thr Lys Gln Thr Gl - #n Cys Glu Gln Leu Asn
180 - # 185 - # 190

- - Arg Ser Val Asp Asp Val Pro Gln Pro Gln Pr - #o Phe Gln His Pro His
195 - # 200 - # 205

- - Leu Tyr Met Ile Ala His Gln Thr Ser Pro Ph - #e Leu Asn Met Gly Gly
210 - # 215 - # 220

- - Leu Tyr Gln Gly Glu Asp Gln Thr Ala Met Ar - #g Arg Asn Asn Leu Asp
225 - # 230 - # 235 - # 240

- - Leu Thr Leu Glu Pro Ile Tyr Asn Tyr Leu Gl - #y Cys Tyr Ala Ala
245 - # 250 - # 255

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 756 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..754

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..756
(D) OTHER INFORMATION: /note= "product = Brassica oleracea CAL."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

- - ATG GGA AGG GGT AGG GTT GAA ATG AAG AGG AT - #A GAG AAC AAG ATC AAC 

48 

Met Gly Arg Gly Arg Val Glu Met Lys Arg Il - #e Glu Asn Lys Ile Asn
1 5 - # 10 - # 15 

- - CGA CAA GTG ACG TTT TCG AAA AGA AGA GCT GG - #T CTT TTG AAG AAA GCC 

96 

Arg Gln Val Thr Phe Ser Lys Arg Arg Ala Gl - #y Leu Leu Lys Lys Ala
20 - # 25 - # 30 

- - CAT GAG ATC TCG ATC CTT TGT GAT GCT GAG GT - #T TCC CTT ATT GTC TTC 

144 

His Glu Ile Ser Ile Leu Cys Asp Ala Glu Va - #l Ser Leu Ile Val Phe
35 - # 40 - # 45 

- - TCC CAT AAG GGG AAA CTG TTC GAG TAC TCG TC - #T GAA TCT TGC ATG GAG 

192 

Ser His Lys Gly Lys Leu Phe Glu Tyr Ser Se - #r Glu Ser Cys Met Glu 
50 - # 55 - # 60 

- - AAG GTA CTA GAA CAC TAC GAG AGG TAC TCT TA - #C GCC GAG AAA CAG CTA 

240 

Lys Val Leu Glu His Tyr Glu Arg Tyr Ser Ty - #r Ala Glu Lys Gln Leu
65 - # 70 - # 75 - # 80 

- - AAA GTT CCA GAC TCT CAC GTC AAT GCA CAA AC - #G AAC TGG TCA GTG GAA 

288 

Lys Val Pro Asp Ser His Val Asn Ala Gln Th - #r Asn Trp Ser Val Glu
85 - # 90 - # 95

- - TAT AGC AGG CTT AAG GCT AAG ATT GAG CTT TT - #G GAG AGA AAC CAA AGG

336 

Tyr Ser Arg Leu Lys Ala Lys Ile Glu Leu Le - #u Glu Arg Asn Gln Arg
100 - # 105 - # 110 

- - CAT TAT CTG GGC GAA GAT TTA GAA TCA ATC AG - #C ATA AAG GAG CTA CAG 

384 

His Tyr Leu Gly Glu Asp Leu Glu Ser Ile Se - #r Ile Lys Glu Leu Gln
115 - # 120 - # 125 

- - AAT CTG GAG CAG CAG CTT GAC ACT TCT CTT AA - #A CAT ATT CGC TCG AGA 

432 

Asn Leu Glu Gln Gln Leu Asp Thr Ser Leu Ly - #s His Ile Arg Ser Arg
130 - # 135 - # 140 

- - AAA AAT CAA CTA ATG CAC GAG TCC CTC AAC CA - #C CTC CAA AGA AAG GAG 

480 

Lys Asn Gln Leu Met His Glu Ser Leu Asn Hi - #s Leu Gln Arg Lys Glu
145 - # 150 - # 155 - # 160

- - AAA GAA ATA CTG GAG GAA AAC AGC ATG CTT GC - #C AAA CAG ATA AGG GAG

528

Lys Glu Ile Leu Glu Glu Asn Ser Met Leu Al - #a Lys Gln Ile Arg Glu
165 - # 170 - # 175 

- - AGG GAG AGT ATC CTA AGG ACA CAT CAA AAC CA - #A TCA GAG CAG CAA AAC 

576

Arg Glu Ser Ile Leu Arg Thr His Gln Asn Gl - #n Ser Glu Gln Gln Asn
180 - # 185 - # 190 

- - CGC AGC CAC CAT GTA GCT CCT CAG CCG CAA CC - #G CAG TTA AAT CCT TAC 

624 

Arg Ser His His Val Ala Pro Gln Pro Gln Pr - #o Gln Leu Asn Pro Tyr
195 - # 200 - # 205 

- - ATG GCA TCA TCT CCT TTC CTA AAT ATG GGT GG - #C ATG TAC CAA GGA GAA 

672 

Met Ala Ser Ser Pro Phe Leu Asn Met Gly Gl - #y Met Tyr Gln Gly Glu
210 - # 215 - # 220 

- - TAT CCA ACG GCG GTG AGG AGG AAC CGT CTC GA - #T CTG ACT CTT GAA CCC 

720 

Tyr Pro Thr Ala Val Arg Arg Asn Arg Leu As - #p Leu Thr Leu Glu Pro
225 - # 230 - # 235 - # 240

- - ATT TAC AAC TGC AAC CTT GGT TAC TTT GCC GC - #A T GA

756

Ile Tyr Asn Cys Asn Leu Gly Tyr Phe Ala Al - #a
245 - # 250

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 251 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

- - Met Gly Arg Gly Arg Val Glu Met Lys Arg Il - #e Glu Asn Lys Ile Asn
1 5 - # 10 - # 15

- - Arg Gln Val Thr Phe Ser Lys Arg Arg Ala Gl - #y Leu Leu Lys Lys Ala
20 - # 25 - # 30

- - His Glu Ile Ser Ile Leu Cys Asp Ala Glu Va - #l Ser Leu Ile Val Phe
35 - # 40 - # 45

- - Ser His Lys Gly Lys Leu Phe Glu Tyr Ser Se - #r Glu Ser Cys Met Glu
50 - # 55 - # 60

- - Lys Val Leu Glu His Tyr Glu Arg Tyr Ser Ty - #r Ala Glu Lys Gln Leu
65 - # 70 - # 75 - # 80

- - Lys Val Pro Asp Ser His Val Asn Ala Gln Th - #r Asn Trp Ser Val Glu
85 - # 90 - # 95

- - Tyr Ser Arg Leu Lys Ala Lys Ile Glu Leu Le - #u Glu Arg Asn Gln Arg
100 - # 105 - # 110

- - His Tyr Leu Gly Glu Asp Leu Glu Ser Ile Se - #r Ile Lys Glu Leu Gln
115 - # 120 - # 125

- - Asn Leu Glu Gln Gln Leu Asp Thr Ser Leu Ly - #s His Ile Arg Ser Arg
130 - # 135 - # 140

- - Lys Asn Gln Leu Met His Glu Ser Leu Asn Hi - #s Leu Gln Arg Lys Glu
145 - # 150 - # 155 - # 160

- - Lys Glu Ile Leu Glu Glu Asn Ser Met Leu Al - #a Lys Gln Ile Arg Glu
165 - # 170 - # 175

- - Arg Glu Ser Ile Leu Arg Thr His Gln Asn Gl - #n Ser Glu Gln Gln Asn
180 - # 185 - # 190

- - Arg Ser His His Val Ala Pro Gln Pro Gln Pr - #o Gln Leu Asn Pro Tyr
195 - # 200 - # 205

- - Met Ala Ser Ser Pro Phe Leu Asn Met Gly Gl - #y Met Tyr Gln Gly Glu
210 - # 215 - # 220

- - Tyr Pro Thr Ala Val Arg Arg Asn Arg Leu As - #p Leu Thr Leu Glu Pro
225 - # 230 - # 235 - # 240

- - Ile Tyr Asn Cys Asn Leu Gly Tyr Phe Ala Al - #a
245 - # 250

- - - - (2) INFORMATION FOR SEQ ID NO: 13:

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 756 base - #pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

- - (ii) MOLECULE TYPE: cDNA

- - (ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..451

- - (ix) FEATURE:
(A) NAME/KEY: misc. sub.-- - #feature
(B) LOCATION: 1..756
(D) OTHER INFORMATION: - #/note= "product = Brassica oleracea
var. botr - #ytis CAL."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

- - ATG GGA AGG GGT AGG GTT GAA ATG AAG AGG AT - #A GAG AAC AAG ATC AAC 

48 

Met Gly Arg Gly Arg Val Glu Met Lys Arg Il - #e Glu Asn Lys Ile Asn
1 5 - # 10 - # 15

- - AGA CAA GTG ACG TTT TCG AAA AGA AGA GCT GG - #T CTT TTG AAG AAA GCC 

96 

Arg Gln Val Thr Phe Ser Lys Arg Arg Ala Gl - #y Leu Leu Lys Lys Ala
20 - # 25 - # 30

- - CAT GAG ATC TCG ATT CTT TGT GAT GCT GAG GT - #T TCC CTT ATT GTC TTC

144

His Glu Ile Ser Ile Leu Cys Asp Ala Glu Va - #l Ser Leu Ile Val Phe
35 - # 40 - # 45 

- - TCC CAT AAG GGG AAA CTG TTC GAG TAC TCG TC - #T GAA TCT TGC ATG GAG 

192 

Ser His Lys Gly Lys Leu Phe Glu Tyr Ser Se - #r Glu Ser Cys Met Glu 
50 - # 55 - # 60 

- - AAG GTA CTA GAA CGC TAC GAG AGG TAC TCT TA - #C GCC GAG AAA CAG CTA 

240 

Lys Val Leu Glu Arg Tyr Glu Arg Tyr Ser Ty - #r Ala Glu Lys Gln Leu
65 - # 70 - # 75 - # 80 

- - AAA GCT CCA GAC TCT CAC GTC AAT GCA CAA AC - #G AAC TGG TCA ATG GAA 

288 

Lys Ala Pro Asp Ser His Val Asn Ala Gln Th - #r Asn Trp Ser Met Glu
85 - # 90 - # 95

- - TAT AGC AGG CTT AAG GCT AAG ATT GAG CTT TG - #G GAG AGG AAC CAA AGG 

336 

Tyr Ser Arg Leu Lys Ala Lys Ile Glu Leu Tr - #p Glu Arg Asn Gln Arg
100 - # 105 - # 110

- - CAT TAT CTG GGA GAA GAT TTA GAA TCA ATC AG - #C ATA AAG GAG CTA CAG 

384 

His Tyr Leu Gly Glu Asp Leu Glu Ser Ile Se - #r Ile Lys Glu Leu Gln
115 - # 120 - # 125 

- - AAT CTG GAG CAG CAG CTT GAC ACT TCT CTT AA - #A CAT ATT CGC TCC AGA 

432 

Asn Leu Glu Gln Gln Leu Asp Thr Ser Leu Ly - #s His Ile Arg Ser Arg
130 - # 135 - # 140 

- - AAA AAT CAA CTA ATG CAC T AGTCCCTCAA CCACCTCCAA - #AGAAAGGAGA

481

Lys Asn Gln Leu Met His
145 - # 150

- - AAGAAATACT GGAGGAAAAC AGCATGCTTG CCAAACAGAT AAAGGAGAGG GA - 
#GAGTATCC 541 

- - TAAGGACACA TCAAAACCAA TCAGAGCAGC AAAACCGCAG CCACCATGTA GC -
#TCCTCAGC 601

- - CGCAACCGCA GTTAAATCCT TACATGGCAT CATCTCCTTT CCTAAATATG GG -
#TGGCATGT 661

- - ACCAAGGAGA ATATCCAACG GCGGTGAGGA GGAACCGTCT CGATCTGACT CT -
#TGAACCCA 721

- - TTTACAACTG CAACCTTGGT TACTTTGCCG CATGA

756

- - - - (2) INFORMATION FOR SEQ ID NO: 14:

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 150 amino - #acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

- - (ii) MOLECULE TYPE: protein 

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

- - Met Gly Arg Gly Arg Val Glu Met Lys Arg Il - #e Glu Asn Lys Ile Asn
1 5 - # 10 - # 15

- - Arg Gln Val Thr Phe Ser Lys Arg Arg Ala Gl - #y Leu Leu Lys Lys Ala
20 - # 25 - # 30

- - His Glu Ile Ser Ile Leu Cys Asp Ala Glu Va - #l Ser Leu Ile Val Phe
35 - # 40 - # 45

- - Ser His Lys Gly Lys Leu Phe Glu Tyr Ser Se - #r Glu Ser Cys Met Glu
50 - # 55 - # 60

- - Lys Val Leu Glu Arg Tyr Glu Arg Tyr Ser Ty - #r Ala Glu Lys Gln Leu
65 - # 70 - # 75 - # 80

- - Lys Ala Pro Asp Ser His Val Asn Ala Gln Th - #r Asn Trp Ser Met Glu
85 - # 90 - # 95

- - Tyr Ser Arg Leu Lys Ala Lys Ile Glu Leu Tr - #p Glu Arg Asn Gln Arg
100 - # 105 - # 110

- - His Tyr Leu Gly Glu Asp Leu Glu Ser Ile Se - #r Ile Lys Glu Leu Gln
115 - # 120 - # 125

- - Asn Leu Glu Gln Gln Leu Asp Thr Ser Leu Ly - #s His Ile Arg Ser Arg
130 - # 135 - # 140

- - Lys Asn Gln Leu Met His
145 - # 150

- - - - (2) INFORMATION FOR SEQ ID NO: 15: 

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1500 base - #pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

- - (ii) MOLECULE TYPE: cDNA

- - (ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 72..1343

- - (ix) FEATURE:
(A) NAME/KEY: misc. sub.-- - #feature
(B) LOCATION: 1..1500
(D) OTHER INFORMATION: - #/note= "product = Arabidopsis
thaliana - # LEAFY (LFY)."

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

- - AAAGCAATCT GCTCAAAAGA GTAAAGAAAG AGAGAAAAAG AGAGTGATAG AG - 
#AGAGAGAG 60 

- - AAAAATAGAT T ATG GAT CCT GAA GGT TTC ACG AGT - #GGC TTA TTC CGG 
TGG 110 

Met Asp Pro - #Glu Gly Phe Thr Ser Gly Leu Phe Arg Trp 
1 - # 5 - # 10 

- - AAC CCA ACG AGA GCA TTG GTT CAA GCA CCA CC - #T CCG GTT CCA CCT CCG 

158 

Asn Pro Thr Arg Ala Leu Val Gln Ala Pro Pr - #o Pro Val Pro Pro Pro
15 - # 20 - # 25 

- - CTG CAG CAA CAG CCG GTG ACA CCG CAG ACG GC - #T GCT TTT GGG ATG CGA 

206 

Leu Gln Gln Gln Pro Val Thr Pro Gln Thr Al - #a Ala Phe Gly Met Arg
30 - # 35 - # 40 - # 45 

- - CTT GGT GGT TTA GAG GGA CTA TTC GGT CCA TA - #C GGT ATA CGT TTC TAC 

254 

Leu Gly Gly Leu Glu Gly Leu Phe Gly Pro Ty - #r Gly Ile Arg Phe Tyr
50 - # 55 - # 60 

- - ACG GCG GCG AAG ATA GCG GAG TTA GGT TTT AC - #G GCG AGC ACG CTT GTG 

302 

Thr Ala Ala Lys Ile Ala Glu Leu Gly Phe Th - #r Ala Ser Thr Leu Val
65 - # 70 - # 75 

- - GGT ATG AAG GAC GAG GAG CTT GAA GAG ATG AT - #G AAT AGT CTC TCT CAT 

350 

Gly Met Lys Asp Glu Glu Leu Glu Glu Met Me - #t Asn Ser Leu Ser His 
80 - # 85 - # 90 

- - ATC TTT CGT TGG GAG CTT CTT GTT GGT GAA CG - #G TAC GGT ATC AAA GCT 

398 

Ile Phe Arg Trp Glu Leu Leu Val Gly Glu Ar - #g Tyr Gly Ile Lys Ala
95 - # 100 - # 105

- - GCC GTT AGA GCT GAA CGG AGA CGA TTG CAA GA - #A GAG GAG GAA GAG GAA 

446 

Ala Val Arg Ala Glu Arg Arg Arg Leu Gln Gl - #u Glu Glu Glu Glu Glu
110 - # 115 - # 120 - # 125

- - TCT TCT AGA CGC CGT CAT TTG CTA CTC TCC GC - #C GCT GGT GAT TCC GGT

494

Ser Ser Arg Arg Arg His Leu Leu Leu Ser Al - #a Ala Gly Asp Ser Gly 
130 - # 135 - # 140 

- - ACT CAT CAC GCT CTT GAT GCT CTC TCC CAA GA - #A GAT GAT TGG ACA GGG 

542 

Thr His His Ala Leu Asp Ala Leu Ser Gln Gl - #u Asp Asp Trp Thr Gly
145 - # 150 - # 155 

- - TTA TCT GAG GAA CCG GTG CAG CAA CAA GAC CA - #G ACT GAT GCG GCG GGG 

590 

Leu Ser Glu Glu Pro Val Gln Gln Gln Asp Gl - #n Thr Asp Ala Ala Gly
160 - # 165 - # 170 

- - AAT AAC GGC GGA GGA GGA AGT GGT TAC TGG GA - #C GCA GGT CAA GGA AAG 

638 

Asn Asn Gly Gly Gly Gly Ser Gly Tyr Trp As - #p Ala Gly Gln Gly Lys
175 - # 180 - # 185 

- - ATG AAG AAG CAA CAG CAG CAG AGA CGG AGA AA - #G AAA CCA ATG CTG ACG 

686 

Met Lys Lys Gln Gln Gln Gln Arg Arg Arg Ly - #s Lys Pro Met Leu Thr
190 - # 195 - # 200 - # 205

- - TCA GTG GAA ACC GAC GAA GAC GTC AAC GAA GG - #T GAG GAT GAC GAC GGG

734

Ser Val Glu Thr Asp Glu Asp Val Asn Glu Gl - #y Glu Asp Asp Asp Gly
210 - # 215 - # 220

- - ATG GAT AAC GGC AAC GGA GGT AGT GGT TTG GG - #G ACA GAG AGA CAG AGG 

782 

Met Asp Asn Gly Asn Gly Gly Ser Gly Leu Gl - #y Thr Glu Arg Gln Arg
225 - # 230 - # 235 

- - GAG CAT CCG TTT ATC GTA ACG GAG CCT GGG GA - #A GTG GCA CGT GGC AAA 

830 

Glu His Pro Phe Ile Val Thr Glu Pro Gly Gl - #u Val Ala Arg Gly Lys
240 - # 245 - # 250 

- - AAG AAC GGC TTA GAT TAT CTG TTC CAC TTG TA - #C GAA CAA TGC CGT GAG 

878 

Lys Asn Gly Leu Asp Tyr Leu Phe His Leu Ty - #r Glu Gln Cys Arg Glu
255 - # 260 - # 265 

- - TTC CTT CTT CAG GTC CAG ACA ATT GCT AAA GA - #C CGT GGC GAA AAA TGC 

926 

Phe Leu Leu Gln Val Gln Thr Ile Ala Lys As - #p Arg Gly Glu Lys Cys
270 - # 275 - # 280 - # 285

- - CCC ACC AAG GTG ACG AAC CAA GTA TTC AGG TA - #C GCG AAG AAA TCA GGA

974

Pro Thr Lys Val Thr Asn Gln Val Phe Arg Ty - #r Ala Lys Lys Ser Gly
290 - # 295 - # 300

- - GCG AGT TAC ATA AAC AAG CCT AAA ATG CGA CA - #C TAC GTT CAC TGT TAC 
1022 

Ala Ser Tyr Ile Asn Lys Pro Lys Met Arg Hi - #s Tyr Val His Cys Tyr
305 - # 310 - # 315 

- - GCT CTC CAC TGC CTA GAC GAA GAA GCT TCA AA - #T GCT CTC AGA AGA GCG 
1070 

Ala Leu His Cys Leu Asp Glu Glu Ala Ser As - #n Ala Leu Arg Arg Ala 
320 - # 325 - # 330 

- - TTT AAA GAA CGC GGT GAG AAC GTT GGC TCA TG - #G CGT CAG GCT TGT TAC 
1118 

Phe Lys Glu Arg Gly Glu Asn Val Gly Ser Tr - #p Arg Gln Ala Cys Tyr
335 - # 340 - # 345 

- - AAG CCA CTT GTG AAC ATC GCT TGT CGT CAT GG - #C TGG GAT ATA GAC GCC 
1166 

Lys Pro Leu Val Asn Ile Ala Cys Arg His Gl - #y Trp Asp Ile Asp Ala
350 - # 355 - # 360 - # 365

- - GTC TTT AAC GCT CAT CCT CGT CTC TCT ATT TG - #G TAT GTT CCA ACA AAG

1214

Val Phe Asn Ala His Pro Arg Leu Ser Ile Tr - #p Tyr Val Pro Thr Lys
370 - # 375 - # 380 

- - CTG CGT CAG CTT TGC CAT TTG GAG CGG AAC AA - #T GCG GTT GCT GCG GCT 
1262 

Leu Arg Gln Leu Cys His Leu Glu Arg Asn As - #n Ala Val Ala Ala Ala
385 - # 390 - # 395 

- - GCG GCT TTA GTT GGC GGT ATT AGC TGT ACC GG - #A TCG TCG ACG TCT GGA 



1310 

Ala Ala Leu Val Gly Gly Ile Ser Cys Thr Gl - #y Ser Ser Thr Ser Gly
400 - # 405 - # 410 

- - CGT GGT GGA TGC GGC GGC GAC GAC TTG CGT TT - #C TAGTTTGGTT TGGGTAGTTG

1363

Arg Gly Gly Cys Gly Gly Asp Asp Leu Arg Ph - #e
415 - # 420

- - TGGTTTGTTT AGTCGTTATC CTAATTAACT ATTAGTCTTT AATTTAGTCT TC -
#TTGGCTAA 1423

- - TTTATTTTTC TTTTTTTGTC AAAACCTTTA ATTTGTTATG GCTAATTTGT TA -
#TACACGCA 1483

- - GTTTTCTTAA TGCGTTA

1500

- - - - (2) INFORMATION FOR SEQ ID NO: 16:

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 424 amino - #acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

- - (ii) MOLECULE TYPE: protein

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

- - Met Asp Pro Glu Gly Phe Thr Ser Gly Leu Ph - #e Arg Trp Asn Pro Thr
1 5 - # 10 - # 15

- - Arg Ala Leu Val Gln Ala Pro Pro Pro Val Pr - #o Pro Pro Leu Gln Gln
20 - # 25 - # 30

- - Gln Pro Val Thr Pro Gln Thr Ala Ala Phe Gl - #y Met Arg Leu Gly Gly
35 - # 40 - # 45

- - Leu Glu Gly Leu Phe Gly Pro Tyr Gly Ile Ar - #g Phe Tyr Thr Ala Ala
50 - # 55 - # 60

- - Lys Ile Ala Glu Leu Gly Phe Thr Ala Ser Th - #r Leu Val Gly Met Lys
65 - # 70 - # 75 - # 80

- - Asp Glu Glu Leu Glu Glu Met Met Asn Ser Le - #u Ser His Ile Phe Arg
85 - # 90 - # 95

- - Trp Glu Leu Leu Val Gly Glu Arg Tyr Gly Il - #e Lys Ala Ala Val Arg
100 - # 105 - # 110

- - Ala Glu Arg Arg Arg Leu Gln Glu Glu Glu Gl - #u Glu Glu Ser Ser Arg
115 - # 120 - # 125

- - Arg Arg His Leu Leu Leu Ser Ala Ala Gly As - #p Ser Gly Thr His His
130 - # 135 - # 140

- - Ala Leu Asp Ala Leu Ser Gln Glu Asp Asp Tr - #p Thr Gly Leu Ser Glu
145 - # 150 - # 155 - # 160

- - Glu Pro Val Gln Gln Gln Asp Gln Thr Asp Al - #a Ala Gly Asn Asn Gly
165 - # 170 - # 175

- - Gly Gly Gly Ser Gly Tyr Trp Asp Ala Gly Gl - #n Gly Lys Met Lys Lys
180 - # 185 - # 190

- - Gln Gln Gln Gln Arg Arg Arg Lys Lys Pro Me - #t Leu Thr Ser Val Glu
195 - # 200 - # 205

- - Thr Asp Glu Asp Val Asn Glu Gly Glu Asp As - #p Asp Gly Met Asp Asn
210 - # 215 - # 220

- - Gly Asn Gly Gly Ser Gly Leu Gly Thr Glu Ar - #g Gln Arg Glu His Pro
225 - # 230 - # 235 - # 240

- - Phe Ile Val Thr Glu Pro Gly Glu Val Ala Ar - #g Gly Lys Lys Asn Gly
245 - # 250 - # 255

- - Leu Asp Tyr Leu Phe His Leu Tyr Glu Gln Cy - #s Arg Glu Phe Leu Leu
260 - # 265 - # 270

- - Gln Val Gln Thr Ile Ala Lys Asp Arg Gly Gl - #u Lys Cys Pro Thr Lys
275 - # 280 - # 285

- - Val Thr Asn Gln Val Phe Arg Tyr Ala Lys Ly - #s Ser Gly Ala Ser Tyr
290 - # 295 - # 300

- - Ile Asn Lys Pro Lys Met Arg His Tyr Val Hi - #s Cys Tyr Ala Leu His
305 - # 310 - # 315 - # 320

- - Cys Leu Asp Glu Glu Ala Ser Asn Ala Leu Ar - #g Arg Ala Phe Lys Glu
325 - # 330 - # 335

- - Arg Gly Glu Asn Val Gly Ser Trp Arg Gln Al - #a Cys Tyr Lys Pro Leu
340 - # 345 - # 350

- - Val Asn Ile Ala Cys Arg His Gly Trp Asp Il - #e Asp Ala Val Phe Asn
355 - # 360 - # 365

- - Ala His Pro Arg Leu Ser Ile Trp Tyr Val Pr - #o Thr Lys Leu Arg Gln
370 - # 375 - # 380

- - Leu Cys His Leu Glu Arg Asn Asn Ala Val Al - #a Ala Ala Ala Ala Leu
385 - # 390 - # 395 - # 400

- - Val Gly Gly Ile Ser Cys Thr Gly Ser Ser Th - #r Ser Gly Arg Gly Gly
405 - # 410 - # 415

- - Cys Gly Gly Asp Asp Leu Arg Phe
420

- - - - (2) INFORMATION FOR SEQ ID NO: 17:

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1656 base - #pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

- - (ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1651

- - (ix) FEATURE:
(A) NAME/KEY: misc. sub.-- - #feature
(B) LOCATION: 1..1656
(D) OTHER INFORMATION: - #/note= "domain = ecdysone receptor
ligand bi - #nding domain."

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

- - ATG CGG CCG GAA TGC GTC GTC CCG GAG AAC CA - #A TGT GCG ATG AAG CGG 

48 

Met Arg Pro Glu Cys Val Val Pro Glu Asn Gl - #n Cys Ala Met Lys Arg 
1 5 - # 10 - # 15 

- - CGC GAA AAG AAG GCC CAG AAG GAG AAG GAC AA - #A ATG ACC ACT TCG CCG 

96 

Arg Glu Lys Lys Ala Gln Lys Glu Lys Asp Ly - #s Met Thr Thr Ser Pro
20 - # 25 - # 30 

- - AGC TCT CAG CAT GGC GGC AAT GGC AGC TTG GC - #C TCT GGT GGC GGC CAA 

144 

Ser Ser Gln His Gly Gly Asn Gly Ser Leu Al - #a Ser Gly Gly Gly Gln
35 - # 40 - # 45 

- - GAC TTT GTT AAG AAG GAG ATT CTT GAC CTT AT - #G ACA TGC GAG CCG CCC 

192 

Asp Phe Val Lys Lys Glu Ile Leu Asp Leu Me - #t Thr Cys Glu Pro Pro
50 - # 55 - # 60

- - CAG CAT GCC ACT ATT CCG CTA CTA CCT GAT GA - #A ATA TTG GCC AAG TGT

240 

Gln His Ala Thr Ile Pro Leu Leu Pro Asp Gl - #u Ile Leu Ala Lys Cys
65 - # 70 - # 75 - # 80 

- - CAA GCG CGC AAT ATA CCT TCC TTA ACG TAC AA - #T CAG TTG GCC GTT ATA 

288 

Gln Ala Arg Asn Ile Pro Ser Leu Thr Tyr As - #n Gln Leu Ala Val Ile
85 - # 90 - # 95 

- - TAC AAG TTA ATT TGG TAC CAG GAT GGC TAT GA - #G CAG CCA TCT GAA GAG 

336 

Tyr Lys Leu Ile Trp Tyr Gln Asp Gly Tyr Gl - #u Gln Pro Ser Glu Glu
100 - # 105 - # 110

- - GAT CTC AGG CGT ATA ATG AGT CAA CCC GAT GA - #G AAC GAG AGC CAA ACG 

384 

Asp Leu Arg Arg Ile Met Ser Gln Pro Asp Gl - #u Asn Glu Ser Gln Thr
115 - # 120 - # 125

- - GAC GTC AGC TTT CGG CAT ATA ACC GAG ATA AC - #C ATA CTC ACG GTC CAG 

432 

Asp Val Ser Phe Arg His Ile Thr Glu Ile Th - #r Ile Leu Thr Val Gln
130 - # 135 - # 140 

- - TTG ATT GTT GAG TTT GCT AAA GGT CTA CCA GC - #G TTT ACA AAG ATA CCC 

480 

Leu Ile Val Glu Phe Ala Lys Gly Leu Pro Al - #a Phe Thr Lys Ile Pro
145 - # 150 - # 155 - # 160

- - CAG GAG GAC CAG ATC ACG TTA CTA AAG GCC TG - #C TCG TCG GAG GTG ATG

528

Gln Glu Asp Gln Ile Thr Leu Leu Lys Ala Cy - #s Ser Ser Glu Val Met
165 - # 170 - # 175 

- - ATG CTG CGT ATG GCA CGA CGC TAT GAC CAC AG - #C TCG GAC TCA ATA TTC 

576 

Met Leu Arg Met Ala Arg Arg Tyr Asp His Se - #r Ser Asp Ser Ile Phe
180 - # 185 - # 190 

- - TTC GCG AAT AAT AGA TCA TAT ACG CGG GAT TC - #T TAC AAA ATG GCC GGA 

624 

Phe Ala Asn Asn Arg Ser Tyr Thr Arg Asp Se - #r Tyr Lys Met Ala Gly 
195 - # 200 - # 205 

- - ATG GCT GAT AAC ATT GAA GAC CTG CTG CAT TT - #C TGC CGC CAA ATG TTC

672

Met Ala Asp Asn Ile Glu Asp Leu Leu His Ph - #e Cys Arg Gln Met Phe
210 - # 215 - # 220 

- - TCG ATG AAG GTG GAC AAC GTC GAA TAC GCG CT - #T CTC ACT GCC ATT GTG 

720 

Ser Met Lys Val Asp Asn Val Glu Tyr Ala Le - #u Leu Thr Ala Ile Val
225 - # 230 - # 235 - # 240

- - ATC TTC TCG GAC CGG CCG GGC CTG GAG AAG GC - #C CAA CTA GTC GAA GCG

768

Ile Phe Ser Asp Arg Pro Gly Leu Glu Lys Al - #a Gln Leu Val Glu Ala
245 - # 250 - # 255 

- - ATC CAG AGC TAC TAC ATC GAC ACG CTA CGC AT - #T TAT ATA CTC AAC CGC 

816 

Ile Gln Ser Tyr Tyr Ile Asp Thr Leu Arg Il - #e Tyr Ile Leu Asn Arg
260 - # 265 - # 270 

- - CAC TGC GGC GAC TCA ATG AGC CTC GTC TTC TA - #C GCA AAG CTG CTC TCG 

864 

His Cys Gly Asp Ser Met Ser Leu Val Phe Ty - #r Ala Lys Leu Leu Ser 
275 - # 280 - # 285 

- - ATC CTC ACC GAG CTG CGT ACG CTG GGC AAC CA - #G AAC GCC GAG ATG TGT 

912 

Ile Leu Thr Glu Leu Arg Thr Leu Gly Asn Gl - #n Asn Ala Glu Met Cys
290 - # 295 - # 300 

- - TTC TCA CTA AAG CTC AAA AAC CGC AAA CTG CC - #C AAG TTC CTC GAG GAG 

960 

Phe Ser Leu Lys Leu Lys Asn Arg Lys Leu Pr - #o Lys Phe Leu Glu Glu 
305 - # 310 - # 315 - # 320

- - ATC TGG GAC GTT CAT GCC ATC CCG CCA TCG GT - #C CAG TCG CAC CTT CAG

1008

Ile Trp Asp Val His Ala Ile Pro Pro Ser Va - #l Gln Ser His Leu Gln
325 - # 330 - # 335 

- - ATT ACC CAG GAG GAG AAC GAG CGT CTC GAG CG - #G GCT GAG CGT ATG CGG 
1056 

Ile Thr Gln Glu Glu Asn Glu Arg Leu Glu Ar - #g Ala Glu Arg Met Arg
340 - # 345 - # 350 

- - GCA TCG GTT GGG GGC GCC ATT ACC GCC GGC AT - #T GAT TGC GAC TCT GCC

1104

Ala Ser Val Gly Gly Ala Ile Thr Ala Gly Il - #e Asp Cys Asp Ser Ala
355 - # 360 - # 365 

- - TCC ACT TCG GCG GCG GCA GCC GCG GCC CAG CA - #T CAG CCT CAG CCT CAG 
1152

Ser Thr Ser Ala Ala Ala Ala Ala Ala Gln Hi - #s Gln Pro Gln Pro Gln
370 - # 375 - # 380 

- - CCC CAG CCC CAA CCC TCC TCC CTG ACC CAG AA - #C GAT TCC CAG CAC CAG 
1200 

Pro Gln Pro Gln Pro Ser Ser Leu Thr Gln As - #n Asp Ser Gln His Gln
385 - # 390 - # 395 - # 400

- - ACA CAG CCG CAG CTA CAA CCT CAG CTA CCA CC - #T CAG CTG CAA GGT CAA

1248

Thr Gln Pro Gln Leu Gln Pro Gln Leu Pro Pr - #o Gln Leu Gln Gly Gln
405 - # 410 - # 415

- - CTG CAA CCC CAG CTC CAA CCA CAG CTT CAG AC - #G CAA CTC CAG CCA CAG 

1296 

Leu Gln Pro Gln Leu Gln Pro Gln Leu Gln Th - #r Gln Leu Gln Pro Gln
420 - # 425 - # 430

- - ATT CAA CCA CAG CCA CAG CTC CTT CCC GTC TC - #C GCT CCC GTG CCC GCC

1344

Ile Gln Pro Gln Pro Gln Leu Leu Pro Val Se - #r Ala Pro Val Pro Ala
435 - # 440 - # 445

- - TCC GTA ACC GCA CCT GGT TCC TTG TCC GCG GT - #C AGT ACG AGC AGC GAA

1392

Ser Val Thr Ala Pro Gly Ser Leu Ser Ala Va - #l Ser Thr Ser Ser Glu
450 - # 455 - # 460

- - TAC ATG GGC GGA AGT GCG GCC ATA GGA CCC AT - #C ACG CCG GCA ACC ACC

1440

Tyr Met Gly Gly Ser Ala Ala Ile Gly Pro Il - #e Thr Pro Ala Thr Thr
465 - # 470 - # 475 - # 480

- - AGC AGT ATC ACG GCT GCC GTT ACC GCT AGC TC - #C ACC ACA TCA GCG GTA

1488

Ser Ser Ile Thr Ala Ala Val Thr Ala Ser Se - #r Thr Thr Ser Ala Val
485 - # 490 - # 495 

- - CCG ATG GGC AAC GGA GTT GGA GTC GGT GTT GG - #G GTG GGC GGC AAC GTC 
1536 

Pro Met Gly Asn Gly Val Gly Val Gly Val Gl - #y Val Gly Gly Asn Val
500 - # 505 - # 510

- - AGC ATG TAT GCG AAC GCC CAG ACG GCG ATG GC - #C TTG ATG GGT GTA GCC 
1584 

Ser Met Tyr Ala Asn Ala Gln Thr Ala Met Al - #a Leu Met Gly Val Ala
515 - # 520 - # 525

- - CTG CAT TCG CAC CAA GAG CAG CTT ATC GGG GG - #A GTG GCG GTT AAG TCG 
1632 

Leu His Ser His Gln Glu Gln Leu Ile Gly Gl - #y Val Ala Val Lys Ser
530 - # 535 - # 540 

- - GAG CAC TCG ACG ACT GCA T AGCAG

1656

Glu His Ser Thr Thr Ala
545 - # 550

- - - - (2) INFORMATION FOR SEQ ID NO: 18:

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 550 amino - #acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

- - (ii) MOLECULE TYPE: protein

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

- - Met Arg Pro Glu Cys Val Val Pro Glu Asn Gl - #n Cys Ala Met Lys Arg
1 5 - # 10 - # 15

- - Arg Glu Lys Lys Ala Gln Lys Glu Lys Asp Ly - #s Met Thr Thr Ser Pro
20 - # 25 - # 30

- - Ser Ser Gln His Gly Gly Asn Gly Ser Leu Al - #a Ser Gly Gly Gly Gln
35 - # 40 - # 45

- - Asp Phe Val Lys Lys Glu Ile Leu Asp Leu Me - #t Thr Cys Glu Pro Pro
50 - # 55 - # 60

- - Gln His Ala Thr Ile Pro Leu Leu Pro Asp Gl - #u Ile Leu Ala Lys Cys
65 - # 70 - # 75 - # 80

- - Gln Ala Arg Asn Ile Pro Ser Leu Thr Tyr As - #n Gln Leu Ala Val Ile
85 - # 90 - # 95

- - Tyr Lys Leu Ile Trp Tyr Gln Asp Gly Tyr Gl - #u Gln Pro Ser Glu Glu
100 - # 105 - # 110

- - Asp Leu Arg Arg Ile Met Ser Gln Pro Asp Gl - #u Asn Glu Ser Gln Thr
115 - # 120 - # 125

- - Asp Val Ser Phe Arg His Ile Thr Glu Ile Th - #r Ile Leu Thr Val Gln
130 - # 135 - # 140

- - Leu Ile Val Glu Phe Ala Lys Gly Leu Pro Al - #a Phe Thr Lys Ile Pro
145 - # 150 - # 155 - # 160

- - Gln Glu Asp Gln Ile Thr Leu Leu Lys Ala Cy - #s Ser Ser Glu Val Met
165 - # 170 - # 175

- - Met Leu Arg Met Ala Arg Arg Tyr Asp His Se - #r Ser Asp Ser Ile Phe
180 - # 185 - # 190

- - Phe Ala Asn Asn Arg Ser Tyr Thr Arg Asp Se - #r Tyr Lys Met Ala Gly
195 - # 200 - # 205

- - Met Ala Asp Asn Ile Glu Asp Leu Leu His Ph - #e Cys Arg Gln Met Phe
210 - # 215 - # 220

- - Ser Met Lys Val Asp Asn Val Glu Tyr Ala Le - #u Leu Thr Ala Ile Val
225 - # 230 - # 235 - # 240

- - Ile Phe Ser Asp Arg Pro Gly Leu Glu Lys Al - #a Gln Leu Val Glu Ala
245 - # 250 - # 255

- - Ile Gln Ser Tyr Tyr Ile Asp Thr Leu Arg Il - #e Tyr Ile Leu Asn Arg
260 - # 265 - # 270

- - His Cys Gly Asp Ser Met Ser Leu Val Phe Ty - #r Ala Lys Leu Leu Ser
275 - # 280 - # 285

- - Ile Leu Thr Glu Leu Arg Thr Leu Gly Asn Gl - #n Asn Ala Glu Met Cys
290 - # 295 - # 300

- - Phe Ser Leu Lys Leu Lys Asn Arg Lys Leu Pr - #o Lys Phe Leu Glu Glu
305 - # 310 - # 315 - # 320

- - Ile Trp Asp Val His Ala Ile Pro Pro Ser Va - #l Gln Ser His Leu Gln
325 - # 330 - # 335

- - Ile Thr Gln Glu Glu Asn Glu Arg Leu Glu Ar - #g Ala Glu Arg Met Arg
340 - # 345 - # 350

- - Ala Ser Val Gly Gly Ala Ile Thr Ala Gly Il - #e Asp Cys Asp Ser Ala
355 - # 360 - # 365

- - Ser Thr Ser Ala Ala Ala Ala Ala Ala Gln Hi - #s Gln Pro Gln Pro Gln
370 - # 375 - # 380

- - Pro Gln Pro Gln Pro Ser Ser Leu Thr Gln As - #n Asp Ser Gln His Gln
385 - # 390 - # 395 - # 400

- - Thr Gln Pro Gln Leu Gln Pro Gln Leu Pro Pr - #o Gln Leu Gln Gly Gln
405 - # 410 - # 415

- - Leu Gln Pro Gln Leu Gln Pro Gln Leu Gln Th - #r Gln Leu Gln Pro Gln
420 - # 425 - # 430

- - Ile Gln Pro Gln Pro Gln Leu Leu Pro Val Se - #r Ala Pro Val Pro Ala
435 - # 440 - # 445

- - Ser Val Thr Ala Pro Gly Ser Leu Ser Ala Va - #l Ser Thr Ser Ser Glu
450 - # 455 - # 460

- - Tyr Met Gly Gly Ser Ala Ala Ile Gly Pro Il - #e Thr Pro Ala Thr Thr
465 - # 470 - # 475 - # 480

- - Ser Ser Ile Thr Ala Ala Val Thr Ala Ser Se - #r Thr Thr Ser Ala Val
485 - # 490 - # 495

- - Pro Met Gly Asn Gly Val Gly Val Gly Val Gl - #y Val Gly Gly Asn Val
500 - # 505 - # 510

- - Ser Met Tyr Ala Asn Ala Gln Thr Ala Met Al - #a Leu Met Gly Val Ala
515 - # 520 - # 525

- - Leu His Ser His Gln Glu Gln Leu Ile Gly Gl - #y Val Ala Val Lys Ser
530 - # 535 - # 540

- - Glu His Ser Thr Thr Ala
545 - # 550

- - - - (2) INFORMATION FOR SEQ ID NO: 19:

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 855 base - #pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

- - (ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..853

- - (ix) FEATURE:
(A) NAME/KEY: misc. sub.-- - #feature
(B) LOCATION: 1..855
(D) OTHER INFORMATION: - #/note= "domain = glucocorticoid
receptor - #ligand binding domain."

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

- - ACA AAG AAA AAA ATC AAA GGG ATT CAG CAA GC - #C ACT GCA GGA GTC TCA 

48 

Thr Lys Lys Lys Ile Lys Gly Ile Gln Gln Al - #a Thr Ala Gly Val Ser
1 5 - # 10 - # 15 

- - CAA GAC ACT TCG GAA AAT CCT AAC AAA ACA AT - #A GTT CCT GCA GCA TTA 

96 

Gln Asp Thr Ser Glu Asn Pro Asn Lys Thr Il - #e Val Pro Ala Ala Leu
20 - # 25 - # 30 

- - CCA CAG CTC ACC CCT ACC TTG GTG TCA CTG CT - #G GAG GTG ATT GAA CCC 

144 

Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Le - #u Glu Val Ile Glu Pro
35 - # 40 - # 45

- - GAG GTG TTG TAT GCA GGA TAT GAT AGC TCT GT - #T CCA GAT TCA GCA TGG

192

Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Va - #l Pro Asp Ser Ala Trp
50 - # 55 - # 60 

- - AGA ATT ATG ACC ACA CTC AAC ATG TTA GGT GG - #G CGT CAA GTG ATT GCA 

240 

Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gl - #y Arg Gln Val Ile Ala
65 - # 70 - # 75 - # 80 

- - GCA GTG AAA TGG GCA AAG GCG ATA CTA GGC TT - #G AGA AAC TTA CAC CTC 

288 

Ala Val Lys Trp Ala Lys Ala Ile Leu Gly Le - #u Arg Asn Leu His Leu
85 - # 90 - # 95 

- - GAT GAC CAA ATG ACC CTG CTA CAG TAC TCA TG - #G ATG TTT CTC ATG GCA 

336 

Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Tr - #p Met Phe Leu Met Ala
100 - # 105 - # 110 

- - TTT GCC TTG GGT TGG AGA TCA TAC AGA CAA TC - #A AGC GGA AAC CTG CTC 

384 

Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Se - #r Ser Gly Asn Leu Leu
115 - # 120 - # 125 

- - TGC TTT GCT CCT GAT CTG ATT ATT AAT GAG CA - #G AGA ATG TCT CTA CCC 

432 

Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gl - #n Arg Met Ser Leu Pro
130 - # 135 - # 140 

- - TGC ATG TAT GAC CAA TGT AAA CAC ATG CTG TT - #T GTC TCC TCT GAA TTA 

480 

Cys Met Tyr Asp Gln Cys Lys His Met Leu Ph - #e Val Ser Ser Glu Leu
145 - # 150 - # 155 - # 160

- - CAA AGA TTG CAG GTA TCC TAT GAA GAG TAT CT - #C TGT ATG AAA ACC TTA

528

Gln Arg Leu Gln Val Ser Tyr Glu Glu Tyr Le - #u Cys Met Lys Thr Leu
165 - # 170 - # 175 

- - CTG CTT CTC TCC TCA GTT GCT AAG GAA GGT CT - #G AAG AGC CAA GAG TTA 

576 

Leu Leu Leu Ser Ser Val Ala Lys Glu Gly Le - #u Lys Ser Gln Glu Leu
180 - # 185 - # 190 

- - TTT GAT GAG ATT CGA ATG ACT TAT ATC AAA GA - #G CTA GGA AAA GCC ATC 

624 

Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Gl - #u Leu Gly Lys Ala Ile
195 - # 200 - # 205 

- - GTC AAA AGG GAA GGG AAC TCC AGT CAG AAC TG - #G CAA CGG TTT TAC CAA 

672 

Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Tr - #p Gln Arg Phe Tyr Gln
210 - # 215 - # 220 

- - CTG ACA AAG CTT CTG GAC TCC ATG CAT GAG GT - #G GTT GAG AAT CTC CTT 

720 

Leu Thr Lys Leu Leu Asp Ser Met His Glu Va - #l Val Glu Asn Leu Leu
225 - # 230 - # 235 - # 240

- - ACC TAC TGC TTC CAG ACA TTT TTG GAT AAG AC - #C ATG AGT ATT GAA TTC

768

Thr Tyr Cys Phe Gln Thr Phe Leu Asp Lys Th - #r Met Ser Ile Glu Phe
245 - # 250 - # 255

- - CCA GAG ATG TTA GCT GAA ATC ATC ACT AAT CA - #G ATA CCA AAA TAT TCA 

816 

Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gl - #n Ile Pro Lys Tyr Ser
260 - # 265 - # 270 

- - AAT GGA AAT ATC AAA AAG CTT CTG TTT CAT CA - #A AAA TGA

855

Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gl - #n Lys
275 - # 280

- - - - (2) INFORMATION FOR SEQ ID NO: 20:

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 284 amino - #acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

- - (ii) MOLECULE TYPE: protein

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

- - Thr Lys Lys Lys Ile Lys Gly Ile Gln Gln Al - #a Thr Ala Gly Val Ser
1 5 - # 10 - # 15

- - Gln Asp Thr Ser Glu Asn Pro Asn Lys Thr Il - #e Val Pro Ala Ala Leu
20 - # 25 - # 30

- - Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Le - #u Glu Val Ile Glu Pro
35 - # 40 - # 45

- - Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Va - #l Pro Asp Ser Ala Trp
50 - # 55 - # 60

- - Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gl - #y Arg Gln Val Ile Ala
65 - # 70 - # 75 - # 80

- - Ala Val Lys Trp Ala Lys Ala Ile Leu Gly Le - #u Arg Asn Leu His Leu
85 - # 90 - # 95

- - Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Tr - #p Met Phe Leu Met Ala
100 - # 105 - # 110

- - Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Se - #r Ser Gly Asn Leu Leu
115 - # 120 - # 125

- - Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gl - #n Arg Met Ser Leu Pro
130 - # 135 - # 140

- - Cys Met Tyr Asp Gln Cys Lys His Met Leu Ph - #e Val Ser Ser Glu Leu
145 - # 150 - # 155 - # 160

- - Gln Arg Leu Gln Val Ser Tyr Glu Glu Tyr Le - #u Cys Met Lys Thr Leu
165 - # 170 - # 175

- - Leu Leu Leu Ser Ser Val Ala Lys Glu Gly Le - #u Lys Ser Gln Glu Leu
180 - # 185 - # 190

- - Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Gl - #u Leu Gly Lys Ala Ile
195 - # 200 - # 205

- - Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Tr - #p Gln Arg Phe Tyr Gln
210 - # 215 - # 220

- - Leu Thr Lys Leu Leu Asp Ser Met His Glu Va - #l Val Glu Asn Leu Leu
225 - # 230 - # 235 - # 240

- - Thr Tyr Cys Phe Gln Thr Phe Leu Asp Lys Th - #r Met Ser Ile Glu Phe
245 - # 250 - # 255

- - Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gl - #n Ile Pro Lys Tyr Ser
260 - # 265 - # 270

- - Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gl - #n Lys
275 - # 280

- - - - (2) INFORMATION FOR SEQ ID NO: 21: 

- - (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base - #pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

- - (ix) FEATURE:
(A) NAME/KEY: misc. sub.-- - #feature
(B) LOCATION: 1..50
(D) OTHER INFORMATION: - #/note= "element = copper inducible
regulatory - #element (ACE1 binding site)."

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

- - AGCTTAGCGA TGCGTCTTTT CCGCTGAACC GTTCCAGCAA AAAAGACTAG

50

- - - - (2) INFORMATION FOR SEQ ID NO: 22: 

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base - #pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

- - (ix) FEATURE:
(A) NAME/KEY: misc. sub.-- - #feature
(B) LOCATION: 1..19
(D) OTHER INFORMATION: - #/note= "element = tet operator."

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

- - ACTCTATCAG TGATAGAGT

19

- - - - (2) INFORMATION FOR SEQ ID NO: 23:

- - (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base - #pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

- - (ix) FEATURE:
(A) NAME/KEY: misc. sub.-- - #feature
(B) LOCATION: 1..29
(D) OTHER INFORMATION: - #/note= "element = ecdysone response
element."

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

- - GATCCGACAA GGGTTCAATG CACTTGTCA

29

- - - - (2) INFORMATION FOR SEQ ID NO: 24:

- - (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 371 base - #pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

- - (ix) FEATURE:
(A) NAME/KEY: misc. sub.-- - #feature
(B) LOCATION: 1..371
(D) OTHER INFORMATION: - #/note= "element = heat shock
inducible - #regulatory element (HSPS1-1 promoter)."

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

- - GTGGAGTCTC GAAACGAAAA GAACTTTCTG GAATTCGTTT GCTCACAAAG CT -
#AAAAACGG 60

- - TTGATTTCAT CGAAATACGG CGTCGTTTTC AAAGAACAAT CCAGAAATCA CT -
#GGTTTTCC 120

- - TTTATTTCAA AAGAAGAGAC TAGAACTTTA TTTCTCCTCT ATAAAATCAC TT -
#TGTTTTTC 180

- - CCTCTCTTCT TCATAAATCA ACAAAACAAT CACAAATCTC TCGAAACGCT CT -
#CGAAGTTC 240

- - CAAATTTTCT CTTAGCATTC TCTTTCGTTT CTCGTTTGCG TTGAATCAAA GT -
#TCGTTGCG 300

- - ATGGCGGATG TTCAGATGGC TGATGCAGAG ACTTTTGCTT TCCAAGCTGA GA -
#TTAACCAG 360

- - CTTCTTAGCT T

- - - - (2) INFORMATION FOR SEQ ID NO: 25:

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base - #pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

- - GGATCCGGAT CAAAAATGGG AAGGGGTAG

29

- - - - (2) INFORMATION FOR SEQ ID NO: 26:

- - (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base - #pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

- - (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

- - GGATCCGCTG CGGCGAAGCA GCCAAGGTTG

30



SEQ ID NO: 27 and SEQ ID NO: 28 

Arabidopsis SEP1 cDNA and Arabidopsis SEP1 amino acid sequence 

20 40 60 80 100 

ATGGGAAGAG GAAGAGTAGA GCTGAAGAGG ATAGAGAACA AAATCAACAG ACAAGTAACG TTTGCAAAGC GTAGGAACGG TTTGTTGAAG AAAGCTTATG 
TACCCTTCTC CTTCTCATCT CGACTTCTCC TATCTCTTGT TTTAGTTGTC TGTTCATTGC AAACGTTTCG CATCCTTGCC AAACAACTTC TTTCGAATAC 
MGR G R V E LKH IEN KINR QVT FAX R R N G L L X KAY> 



120 140 160 160 200 

AATTGTCTGT TCTCTGTGAT GCTGAAGTTG CTCTCATCAT CTTCTCCAAC CGTGGAAAGC TCTATGAGTT TTGCAGCTCC TCAAACATGC TCAAGACACT 
TTAACAGACA AGAGACACTA CGACTTCAAC GAGAGTAGTA GAAGAGGTTG GCACCTTTCG AGATACTCAA AACGTCGAGG AGTTTGTACG AGTTCTGTGA 
ELSV LCD AEV ALII FSN RGK LYEF CSS SNM LKTL> 



220 240 260 280 300 

TGATCGGTAC CAGAAATGCA GCTATGGATC CATTGAAGTC AACAACAAAC CTGCCAAAGA ACTTGAGAAC AGCTACAGAG AATATCTGAA GCTTAAGGGT 
ACTAGCCATG GTCTTTACGT CGATACCTAG GTAACTTCAG TTGTTGTTTG GACGGTTTCT TGAACTCTTG TCGATGTCTC TTATAGACTT CGAATTCCCA 
DRY QKC SYGS IEV NNK PAKE LEN SYR EYLK LKG> 



320 340 360 380 400 

AGATATGAGA ACCTTCAACG TCAACAGAGA AATCTTCTTG GGGAGGATTT AGGACCTTTG AATTCAAAGG AGTTAGAGCA GCTTGAGCGT CAACTGGACG 
TCTATACTCT TGGAAGTTGC AGTTGTCTCT TTAGAAGAAC CCCTCCTAAA TCCTGGAAAC TTAAGTTTCC TCAATCTCGT CGAACTCGCA GTTGACCTGC 
RYE NLQR QQR NLL GEDL GPL NSK ELEQ LER QLD> 



420 440 460 480 500 

GCTCTCTCAA GCAAGTTCGG TCCATCAAGA CACAGTACAT GCTTGACCAG CTCTCGGATC TTCAAAATAA AGAGCAAATG TTGCTTGAAA CCAATAGAGC 
CGAGAGAGTT CGTTCAAGCC AGGTAGTTCT GTGTCATGTA CGAACTGGTC GAGAGCCTAG AAGTTTTATT TCTCGTTTAC AACGAACTTT GGTTATCTCG 
GSLK QVR SIK TQYM LDQ LSD LQNK EQM LLE TNRA> 



520 540 560 580 600 

TTTGGCAATG AAGCTGGATG ATATGATTGG TGTGAGAAGT CATCATATGG GAGGATGGGA AGGCGGTGAA CAGAATGTTA CCTACGCGCA TCATCAAGCT 
AAACCGTTAC TTCGACCTAC TATACTAACC ACACTCTTCA GTAGTATACC CTCCTACCCT TCCGCCACTT GTCTTACAAT GGATGCGCGT AGTAGTTCGA 
LAM KLD DMIG VRS HHM GGWE GGE QNV TYAH HQA> 



620 640 660 680 700 

CAGTCTCAGG GACTATACCA GCCTCTTGAA TGCAATCCAA CTCTGCAAAT GGGGTATGAT AATCCAGTAT GCTCTGAGCA AATCACTGCG ACAACACAAG 
GTCAGAGTCC CTGATATGGT CGGAGAACTT ACGTTAGGTT GAGACGTTTA CCCCATACTA TTAGGTCATA CGAGACTCGT TTAGTGACGC TGTTGTGTTC 
QSQ GLYQ PLE CNP TLQM GYD NPV CSEQ ITA TTQ> 

720 740 
CTCAGGCGCA GCCGGGAAAC GGTTACATTC CAGGATGGAT GCTCTGA 
GAGTCCGCGT CGGCCCTTTG CCAATGTAAG GTCCTACCTA CGAGACT 
AQAQ PGN GYI PGWM L*> 

SEQ ID NO: 29 and SEQ ID NO: 30 

Arabidopsis SEP2 cDNA and Arabidopsis SEP2 amino acid sequence 

20 40 60 80 100 

ATGGGAAGAG GAAGAGTAGA GCTCAAGAGG ATAGAGAACA AAATCAACAG ACAAGTGACG TTTGCTAAAC GTAGAAATGG TTTGCTGAAA AAAGCTTATG 
TACCCTTCTC CTTCTCATCT CGAGTTCTCC TATCTCTTGT TTTAGTTGTC TGTTCACTGC AAACGATTTG CATCTTTACC AAACGACTTT TTTCGAATAC 
MGR GRVE LKR IEN KINR QVT FAK RRNG LLK KAY> 

120 140 160 180 200 

AGCTTTCTGT TCTCTGCGAT GCTGAAGTCT CTCTCATCGT CTTCTCCAAC CGTGGCAAGC TCTACGAGTT CTGCAGCACC TCCAACATGC TCAAGACACT 

TCGAAAGACA AGAGACGCTA CGACTTCAGA GAGAGTAGCA GAAGAGGTTG GCACCGTTCG AGATGCTCAA GACGTCGTGG AGGTTGTACG AGTTCTGTGA 

ELSV LCD AEV SLIV FSN RGK LYEF CST SNM L K T L> 

220 240 260 280 300 

GGAAAGGTAT CAGAAGTGTA GCTATGGCTC CATTGAAGTC AACAACAAAC CTGCTAAAGA GCTTGAGAAC AGCTACAGAG AGTACTTGAA GCTGAAAGGT 

CCTTTCCATA GTCTTCACAT CGATACCGAG GTAACTTCAG TTGTTGTTTG GACGATTTCT CGAACTCTTG TCGATGTCTC TCATGAACTT CGACTTTCCA 

ERY QKC SYGS IEV NNK PAKE LEN SYR EYLK LKG> 

320 340 360 380 400 

AGATATGAAA ATCTGCAACG TCAGCAGAGA AATCTTCTTG GAGAGGATCT TGGACCTCTG AATTCAAAGG AGCTAGAGCA GCTTGAGCGT CAACTAGACG 
TCTATACTTT TAGACGTTGC AGTCGTCTCT TTAGAAGAAC CTCTCCTAGA ACCTGGAGAC TTAAGTTTCC TCGATCTCGT CGAACTCGCA GTTGATCTGC 
RYE NLQR QQR NLL GEDL GPL NSK ELEQ LER QLD> 

420 440 460 480 500 

GCTCTCTGAA GCAAGTTCGC TGCATCAAGA CACAGTATAT GCTTGACCAG CTCTCTGATC TTCAAGGTAA GGAGCATATC TTGCTTGATG CCAACAGAGC 
CGAGAGACTT CGTTCAAGCG ACGTAGTTCT GTGTCATATA CGAACTGGTC GAGAGACTAG AAGTTCCATT CCTCGTATAG AACGAACTAC GGTTGTCTCG 
GSLK QVR CIK TQYM LDQ LSD LQGK EHI LLD ANRA> 

520 540 560 580 600 

TTTGTCAATG AAGCTGGAAG ATATGATCGG CGTGAGACAT CACCATATAG GAGGAGGATG GGAAGGTGGT GATCAACAGA ATATTGCCTA TGGACATCCT 
AAACAGTTAC TTCGACCTTC TATACTAGCC GCACTCTGTA GTGGTATATC CTCCTCCTAC CCTTCCACCA CTAGTTGTCT TATAACGGAT ACCTGTAGGA 
LSM KLE DMIG VRH HHI GGGW EGG DQQ NIAY GHP> 

620 640 660 680 700 

CAGGCTCATT CTCAGGGACT ATACCAATCT CTTGAATGTG ATCCCACTTT GCAAATTGGA TATAGCCATC CAGTGTGCTC AGAGCAAATG GCTGTGACGG 

GTCCGAGTAA GAGTCCCTGA TATGGTTAGA GAACTTACAC TAGGGTGAAA CGTTTAACCT ATATCGGTAG GTCACACGAG TCTCGTTTAC CGACACTGCC 

QAH SQGL Y Q S L E C DPTL QIG Y S H PVCS EQM AVT> 

720 740 
TGCAAGGTCA GTCCCAACAA GGAAACGGCT ACATCCCTGG CTGGATGCTG TGA 
ACGTTCCAGT CAGGGTTGTT CCTTTGCCGA TGTAGGGACC GACCTACGAC ACT 
VQGQ SQQ GNG YIPG WML *> 

SEQ ID NO: 31 and SEQ ID NO: 32 

Arabidopsis SEP3 cDNA and Arabidopsis SEP3 amino acid sequence 

20 40 60 80 100 

ATGGGAAGAG GGAGAGTAGA ATTGAAGAGG ATAGAGAACA AGATCAATAG GCAAGTGACG TTTGCAAAGA GAAGGAATGG TCTTTTGAAG AAAGCATACG 
TACCCTTCTC CCTCTCATCT TAACTTCTCC TATCTCTTGT TCTAGTTATC CGTTCACTGC AAACGTTTCT CTTCCTTACC AGAAAACTTC TTTCGTATGC 
MGR GRVE LKR IEN K I N R QVT FAK RRNG LLK K A Y> 

120 140 160 180 200 

AGCTTTCAGT TCTATGTGAT GCAGAAGTTG CTCTCATCAT CTTCTCAAAT AGAGGAAAGC TGTACGAGTT TTGCAGTAGT TCGAGCATGC TTCGGACACT 
TCGAAAGTCA AGATACACTA CGTCTTCAAC GAGAGTAGTA GAAGAGTTTA TCTCCTTTCG ACATGCTCAA AACGTCATCA AGCTCGTACG AAGCCTGTGA 
ELSV LCD AEV ALII FSN RGK LYEF CSS SSM LRTL> 

220 240 260 280 300 

GGAGAGGTAC CAAAAGTGTA ACTATGGAGC ACCAGAACCC AATGTGCCTT CAAGAGAGGC CTTAGCAGTT GAACTTAGTA GCCAGCAGGA GTATCTCAAG 
CCTCTCCATG GTTTTCACAT TGATACCTCG TGGTCTTGGG TTACACGGAA GTTCTCTCCG GAATCGTCAA CTTGAATCAT CGGTCGTCCT CATAGAGTTC 
ERY QKC NYGA PEP NVP SREA LAV ELS SQQE Y L K> 

320 340 360 380 400 

CTTAAGGAGC GTTATGACGC CTTACAAAGA ACCCAAAGGA ATCTGTTGGG AGAAGATCTT GGACCTCTAA GTACAAAGGA GCTTGAGTCA CTTGAGAGAC 

GAATTCCTCG CAATACTGCG GAATGTTTCT TGGGTTTCCT TAGACAACCC TCTTCTAGAA CCTGGAGATT CATGTTTCCT CGAACTCAGT GAACTCTCTG 

LKE RYDA LQR TQR NLLG EDL GPL STKE LES LER> 

420 440 460 480 500 

AGCTTGATTC TTCCTTGAAG CAGATCAGAG CTCTCAGGAC ACAGTTTATG CTTGACCAGC TCAACGATCT TCAGAGTAAG TTAGCTGATG GGTATCAGAT 
TCGAACTAAG AAGGAACTTC GTCTAGTCTC GAGAGTCCTG TGTCAAATAC GAACTGGTCG AGTTGCTAGA AGTCTCATTC AATCGACTAC CCATAGTCTA 
QLDS SLK QIR ALRT QFM LDQ LNDL QSK LAD GYQM> 

520 540 560 580 600 

GCCACTCCAG CTGAACCCTA ACCAAGAAGA GGTTGATCAC TACGGTCGTC ATCATCATCA ACAACAACAA CACTCCCAAG CTTTCTTCCA GCCTTTGGAA 
CGGTGAGGTC GACTTGGGAT TGGTTCTTCT CCAACTAGTG ATGCCAGCAG TAGTAGTAGT TGTTGTTGTT GTGAGGGTTC GAAAGAAGGT CGGAAACCTT 
PLQ LNP NQEE VDH YGR HHHQ QQQ HSQ AFFQ PLE> 

620 640 660 680 700 

TGTGAACCCA TTCTTCAGAT CGGGTATCAG GGGCAGCAAG ATGGAATGGG AGCAGGACCA AGTGTGAATA ATTACATGTT GGGTTGGTTA CCTTATGACA 
ACACTTGGGT AAGAAGTCTA GCCCATAGTC CCCGTCGTTC TACCTTACCC TCGTCCTGGT TCACACTTAT TAATGTACAA CCCAACCAAT GGAATACTGT 
CEP ILQI GYQ GQQ DGMG AGP SVN NYML GWL PYD> 

CCAACTCTAT TTGA 
GGTTGAGATA AACT 
TNSI *> 

SEQ ID NO: 33 and SEQ ID NO: 34 

Arabidopsis AGL20 cDNA and Arabidopsis AGL20 amino acid sequence 

20 40 60 80 100 

ATGGTGAGGG GCAAAACTCA GATGAAGAGA ATAGAGAATG CAACAAGCAG ACAAGTGACT TTCTCCAAAA GAAGGAATGG TTTGTTGAAG AAAGCCTTTG 

TACCACTCCC CGTTTTGAGT CTACTTCTCT TATCTCTTAC GTTGTTCGTC TGTTCACTGA AAGAGGTTTT CTTCCTTACC AAACAACTTC TTTCGGAAAC 

MVR GKTQ MKR IEN ATSR QVT FSK RRNG LLK KAF> 

120 140 160 180 200 

AGCTCTCAGT GCTTTGTGAT GCTGAAGTTT CTCTTATCAT CTTCTCTCCT AAAGGCAAAC TTTATGAATT CGCCAGCTCC AATATGCAAG ATACCATAGA 
TCGAGAGTCA CGAAACACTA CGACTTCAAA GAGAATAGTA GAAGAGAGGA TTTCCGTTTG AAATACTTAA GCGGTCGAGG TTATACGTTC TATGGTATCT 
ELSV LCD AEV SLII FSP KGK LYEF ASS NMQ DTID> 

220 240 260 280 300 

TCGTTATCTG AGGCATACTA AGGATCGAGT CAGCACCAAA CCGGTTTCTG AAGAAAATAT GCAGCATTTG AAATATGAAG CAGCAAACAT GATGAAGAAA 

AGCAATAGAC TCCGTATGAT TCCTAGCTCA GTCGTGGTTT GGCCAAAGAC TTCTTTTATA CGTCGTAAAC TTTATACTTC GTCGTTTGTA CTACTTCTTT 

R Y L RHT KDRV STK PVS EENM QHL KYE AANM MKK> 

320 340 360 380 400 

ATTGAACAAC TCGAAGCTTC TAAACGTAAA CTCTTGGGAG AAGGCATAGG AACATGCTCA ATCGAGGAGC TGCAACAGAT TGAGCAACAG CTTGAGAAAA 
TAACTTGTTG AGCTTCGAAG ATTTGCATTT GAGAACCCTC TTCCGTATCC TTGTACGAGT TAGCTCCTCG ACGTTGTCTA ACTCGTTGTC GAACTCTTTT 
IEQ LEAS KRK LLG EGIG TCS IEE LQQI EQQ LEK> 

420 440 460 480 500 

GTGTCAAATG TATTCGAGCA AGAAAGACTC AAGTGTTTAA GGAACAAATT GAGCAGCTCA AGCAAAAGGA GAAAGCTCTA GCTGCAGAAA ACGAGAAGCT 

CACAGTTTAC ATAAGCTCGT TCTTTCTGAG TTCACAAATT CCTTGTTTAA CTCGTCGAGT TCGTTTTCCT CTTTCGAGAT CGACGTCTTT TGCTCTTCGA 

SVKC IRA RKT QVFK E Q I E Q L K Q K E KAL A A E N E K L> 

520 540 560 580 600 

CTCTGAAAAG TGGGGATCTC ATGAAAGCGA AGTTTGGTCA AATAAGAATC AAGAAAGTAC TGGAAGAGGT GATGAAGAGA GTAGCCCAAG TTCTGAAGTA 
GAGACTTTTC ACCCCTAGAG TACTTTCGCT TCAAACCAGT TTATTCTTAG TTCTTTCATG ACCTTCTCCA CTACTTCTCT CATCGGGTTC AAGACTTCAT 
SEK WGS HESE VWS NKN QEST GRG DEE SSPS SEV> 

620 640 
GAGACGCAAT TGTTCATTGG GTTACCTTGT TCTTCAAGAA AGTGA 
CTCTGCGTTA ACAAGTAACC CAATGGAACA AGAAGTTCTT TCACT 
ETQ L F I G LPC SSR K*> 

SEQ ID NO: 35 and SEQ ID NO: 36 

Arabidopsis AGL22 cDNA and Arabidopsis AGL22 amino acid sequence 

20 40 60 80 100 

ATGGCGAGAG AAAAGATTCA GATCAGGAAG ATCGACAACG CAACGGCGAG ACAAGTGACG TTTTCGAAAC GAAGAAGAGG GCTTTTCAAG AAAGCTGAAG 
TACCGCTCTC TTTTCTAAGT CTAGTCCTTC TAGCTGTTGC GTTGCCGCTC TGTTCACTGC AAAAGCTTTG CTTCTTCTCC CGAAAAGTTC TTTCGACTTC 
MAR EKIQ IRK IDN ATAR QVT FSK RRRG LFK K A E> 

120 140 160 180 200 

AACTCTCCGT TCTCTGCGAC GCCGATGTCG CTCTCATCAT CTTCTCTTCC ACCGGAAAAC TGTTCGAGTT CTGTAGCTCC AGCATGAAGG AAGTCCTAGA 

TTGAGAGGCA AGAGACGCTG CGGCTACAGC GAGAGTAGTA GAAGAGAAGG TGGCCTTTTG ACAAGCTCAA GACATCGAGG TCGTACTTCC TTCAGGATCT 

ELSV LCD ADV ALII FSS TGK LFEF CSS SMK EVLE> 

220 240 260 280 300 

GAGGCATAAC TTNCAGTCAA AGAACTTGGA GAAGCTTCAT CAGCCATCTC TTGAGTTACA GCTGGTTGAG AACAGTGATC ACGCCCGAAT GAGTAAAGAA 
CTCCGTATTG AANGTCAGTT TCTTGAACCT CTTCGAAGTA GTCGGTAGAG AACTCAATGT CGACCAACTC TTGTCACTAG TGCGGGCTTA CTCATTTCTT 
RHN XQS KNLE KLH QPS LELQ LVE NSD HARM SKE> 

320 340 360 380 400 

ATTGCGGACA AGAGCCACCG ACTAAGGCAA ATGAGAGGAG AGGAACTTCA AGGACTTGAC ATTGAAGAGC TTCAGCAGCT AGAGAAGGCC CTTGAAACTG 
TAACGCCTGT TCTCGGTGGC TGATTCCGTT TACTCTCCTC TCCTTGAAGT TCCTGAACTG TAACTTCTCG AAGTCGTCGA TCTCTTCCGG GAACTTTGAC 
IAD KSHR LRQ MRG EELQ GLD IEE LQQL EKA LET> 

420 440 460 480 500 

GTTTGACGCG TGTGATTGAA ACAAAGAGTG ACAAGATTAT GAGTGAGATC AGCGAACTTC AGAAAAAGGG AATGCAATTG ATGGATGAGA ACAAGCGGTT 
CAAACTGCGC ACACTAACTT TGTTTCTCAC TGTTCTAATA CTCACTCTAG TCGCTTGAAG TCTTTTTCCC TTACGTTAAC TACCTACTCT TGTTCGCCAA 
GLTR VIE TKS DKIM SEI SEL Q K K G MQL MDE NKRL> 

520 540 560 580 600 

GAGGCAGCAA GTATGTGTCT TACCCTCTCT GTTGATAACA AATCCCTTTC TTTTGTCTAC CATTAACGTA CACACTCCTA AATTTAATCC CCAGTTGTCT 

CTCCGTCGTT CATACACAGA ATGGGAGAGA CAACTATTGT TTAGGGAAAG AAAACAGATG GTAATTGCAT GTGTGAGGAT TTAAATTAGG GGTCAACAGA 

RQQ VCV L P S L LIT N P F LLST INV HTP KFNP QLS> 

620 

ACAACACATA TGTTTGATCA TACTGTGAGA TAA 
TGTTGTGTAT ACAAACTAGT ATGACACTCT ATT 
TTH MFDH TVR *> 

SEQ ID NO: 37 and SEQ ID NO: 38 

Arabidopsis AGL24 cDNA and Arabidopsis AGL24 amino acid sequence 

20 40 60 80 100 

ATGGCGAGAG AGAAGATAAG GATAAAGAAG ATTGATAACA TAACAGCGAG ACAAGTTACT TTCTCAAAGA GAAGAAGAGG AATCTTCAAG AAAGCCGATG 
TACCGCTCTC TCTTCTATTC CTATTTCTTC TAACTATTGT ATTGTCGCTC TGTTCAATGA AAGAGTTTCT CTTCTTCTCC TTAGAAGTTC TTTCGGCTAC 
MAR EKIR IKK IDN ITAR QVT FSK RRRG IFK K A D> 

120 140 160 180 200 

AACTTTCAGT TCTTTGCGAT GCTGATGTTG CTCTCATCAT CTTCTCTGCC ACCGGAAAGC TCTTCGAGTT CTCCAGCTCA AGAATGAGAG ACATATTGGG 
TTGAAAGTCA AGAAACGCTA CGACTACAAC GAGAGTAGTA GAAGAGACGG TGGCCTTTCG AGAAGCTCAA GAGGTCGAGT TCTTACTCTC TGTATAACCC 
ELSV LCD A D V ALII FSA TGK LFEF SSS RMR D I L G> 

220 240 260 280 300 

AAGGTATAGT CTTCATGCAA GTAACATCAA CAAATTGATG GATCCACCTT CTACTCATCT CCGGCTTGAG AATTGTAACC TCTCCAGACT AAGTAAGGAA 
TTCCATATCA GAAGTACGTT CATTGTAGTT GTTTAACTAC CTAGGTGGAA GATGAGTAGA GGCCGAACTC TTAACATTGG AGAGGTCTGA TTCATTCCTT 
RYS LHA SNIN KLM DPP STHL RLE NCN LSRL SKE> 

320 340 360 380 400 

GTCGAAGACA AAACCAAGCA GCTACGGAAA CTGAGAGGAG AGGATCTTGA TGGATTGAAC TTAGAAGAGT TGCAGCGGCT GGAGAAACTA CTTGAATCCG 

CAGCTTCTGT TTTGGTTCGT CGATGCCTTT GACTCTCCTC TCCTAGAACT ACCTAACTTG AATCTTCTCA ACGTCGCCGA CCTCTTTGAT GAACTTAGGC 
VED KTKQ LRK LRG EDLD GLN LEE LQRL EKL LES> 

420 440 460 480 500 

GACTTAGCCG TGTGTCTGAA AAGAAGGGCG AGTGTGTGAT GAGCCAAATT TTCTCACTTG AGAAACGGGG ATCGGAATTG GTGGATGAGA ATAAGAGACT 
CTGAATCGGC ACACAGACTT TTCTTCCCGC TCACACACTA CTCGGTTTAA AAGAGTGAAC TCTTTGCCCC TAGCCTTAAC CACCTACTCT TATTCTCTGA 
GLSR VSE KKG ECVM SQI FSL EKRG SEL VDE NKRL> 

520 540 560 580 600 

GAGGGATAAA CTAGAGACGT TGGAAAGGGC AAAACTGACG ACGCTTAAAG AGGCTTTGGA GACAGAGTCG GTGACCACAA ATGTGTCAAG CTACGACAGT 
CTCCCTATTT GATCTCTGCA ACCTTTCCCG TTTTGACTGC TGCGAATTTC TCCGAAACCT CTGTCTCAGC CACTGGTGTT TACACAGTTC GATGCTGTCA 
RDK LET LERA KLT TLK EALE TES VTT NVSS YDS> 

620 640 660 

GGAACTCCCC TTGAGGATGA CTCCGACACT TCCCTGAAGC TTGGGCTTCC ATCTTGGGAA TGA 
CCTTGAGGGG AACTCCTACT GAGGCTGTGA AGGGACTTCG AACCCGAAGG TAGAACCCTT ACT 
GTP LEDD SDT SLK LGLP SWE *> 

SEQ ID NO: 39 and SEQ ID NO: 40 

Arabidopsis AGL27 cDNA and Arabidopsis AGL27 amino acid sequence 

20 40 60 80 100 

ATGGGAAGAA GAAAAATCGA GATCAAGCGA ATCGAGAACA AAAGCAGTCG ACAAGTCACT TTCTCCAAAC GACGCAATGG TCTCATCGAC AAAGCTCGAC 
TACCCTTCTT CTTTTTAGCT CTAGTTCGCT TAGCTCTTGT TTTCGTCAGC TGTTCAGTGA AAGAGGTTTG CTGCGTTACC AGAGTAGCTG TTTCGAGCTG 
MGR RKIE IKR IEN KSSR QVT FSK RRNG LID KAR> 

120 140 160 180 200 

AACTTTCGAT TCTCTGTGAA TCCTCCGTCG CTGTTGTCGT CGTATCTGCC TCCGGAAAAC TCTATGACTC TTCCTCCGGT GACGACATTT CCAAGATCAT 
TTGAAAGCTA AGAGACACTT AGGAGGCAGC GACAACAGCA GCATAGACGG AGGCCTTTTG AGATACTGAG AAGGAGGCCA CTGCTGTAAA GGTTCTAGTA 
QLSI LCE SSV AVVV VSA SGK LYDS SSG DDI SKII> 

220 240 260 280 300 

TGATCGTTAT GAAATACAAC ATGCTGATGA ACTTAGAGCC TTAGATCTTG AAGAAAAAAT TCAGAATTAT CTTCCACACA AGGAGTTACT AGAAACAGTC 
ACTAGCAATA CTTTATGTTG TACGACTACT TGAATCTCGG AATCTAGAAC TTCTTTTTTA AGTCTTAATA GAAGGTGTGT TCCTCAATGA TCTTTGTCAG 
DRY EIQ HADE LRA LDL EEKI QNY LPH KELL ETV> 

320 340 360 380 400 

CAAAGCAAGC TTGAAGAACC AAATGTCGAT AATGTAAGTG TAGATTCTCT AATTTCTCTG GAGGAACAAC TTGAGACTGC TCTGTCCGTA AGTAGAGCTA 
GTTTCGTTCG AACTTCTTGG TTTACAGCTA TTACATTCAC ATCTAAGAGA TTAAAGAGAC CTCCTTGTTG AACTCTGACG AGACAGGCAT TCATCTCGAT 
QSK LEEP NVD N V S VDSL ISL E E Q LETA LSV SRA> 

420 440 460 480 500 

GGAAGGCAGA ACTGATGATG GAGTATATCG AGTCCCTTAA AGAAAAGGAG AAATTGCTGA GAGAAGAGAA CCAGGTTCTG GCTAGCCAGC TGTCAGAGAA 
CCTTCCGTCT TGACTACTAC CTCATATAGC TCAGGGAATT TCTTTTCCTC TTTAACGACT CTCTTCTCTT GGTCCAAGAC CGATCGGTCG ACAGTCTCTT 
RKAE LMM EYI ESLK EKE KLL REEN QVL ASQ LSEK> 

520 540 560 580 600 

GAAAGGTATG TCTCACCGAT GAAAGATACT CAAAACCCGA TGGGAAAGAA TACGTTGCTG GCAACAGATG ATGAGAGAGG AATGTTTCCG GGAAGTAGCT 

CTTTCCATAC AGAGTGGCTA CTTTCTATGA GTTTTGGGCT ACCCTTTCTT ATGCAACGAC CGTTGTCTAC TACTCTCTCC TTACAAAGGC CCTTCATCGA 

KGM SHR *KIL KTR WER IRCW QQM MRE ECFR EVA> 

620 640 660 680 

CCGGCAACAA AATACCGGAG ACTCTCCCGC TGCTCAATTA GCCACCATCA TCAACGGCTG AGTTTTCACC TTAAACTCAA AGCCTGA 
GGCCGTTGTT TTATGGCCTC TGAGAGGGCG ACGAGTTAAT CGGTGGTAGT AGTTGCCGAC TCAAAAGTGG AATTTGAGTT TCGGACT 
PAT KYRR LSR CSI SHHH QRL SFH LKLK A*> 

SEQ ID NO: 41 

Arabidopsis SEP1 genomic sequence 

-2981 -2961 -2941 -2921 -2901 

CAGATCTCTT GGCATGTGTC GAAAATGTGG AGATCTTAAG AATGTAGCTT GTGGCCGTTG CAAAGGAACA GGAACAATCA AATCAGGAGG ATTCTTTGGT 
GTCTAGAGAA CCGTACACAG CTTTTACACC TCTAGAATTC TTACATCGAA CACCGGCAAC GTTTCCTTGT CCTTGTTAGT TTAGTCCTCC TAAGAAACCA 

-2881 -2861 -2841 -2821 -2801 

TTCAGTGACT CATCAAACAC AAGATCAGTG GCTTGCGATA ATTGCCAAGC CAAAGGTTGT TTCCCTTGCC CTGAATGCTC AAAATCTTGA CCATTTTCTC 
AAGTCACTGA GTAGTTTGTG TTCTAGTCAC CGAACGCTAT TAACGGTTCG GTTTCCAACA AAGGGAACGG GACTTACGAG TTTTAGAACT GGTAAAAGAG 

-2781 -2761 -2741 -2721 -2701 

GGTATTTTAT AGTTGTTTCA TCTTCTTGAC ACTATGATAA GTGTAATCGG TCCATTGGTA ATGGTAATGT TAAAGTTGAA GAATGTCTTG TTTATTCGAG 
CCATAAAATA TCAACAAAGT AGAAGAACTG TGATACTATT CACATTAGCC AGGTAACCAT TACCATTACA ATTTCAACTT CTTACAGAAC AAATAAGCTC 

-2681 -2661 -2641 -2621 -2601 

AAGTCTCTTA TTCCAATTCT TGATCTGTTA CTGCAAATAA GGCACTTTGC TTAGATGTAC CGGATGCTTA TGAATTACTG AGTAGGTTAA CTTTAACCGG 
TTCAGAGAAT AAGGTTAAGA ACTAGACAAT GACGTTTATT CCGTGAAACG AATCTACATG GCCTACGAAT ACTTAATGAC TCATCCAATT GAAATTGGCC 

-2581 -2561 -2541 -2521 -2501 

GTTTTATCGT CATTAAACCG GAGAAATTCA TCTAGTAACC AAATGCTCTG CTGGACCTTT CTTTCAGTGA GCAACTATAG GTGGGTTTTT GGCAGTTGAT 
CAAAATAGCA GTAATTTGGC CTCTTTAAGT AGATCATTGG TTTACGAGAC GACCTGGAAA GAAAGTCACT CGTTGATATC CACCCAAAAA CCGTCAACTA 

-2481 -2461 -2441 -2421 -2401 

GTACCATAAT TGGTGCAAAC ACACATTTTT CTTGAATTTT TGTTTAACTT AAATAAAGTT ACTTCGTTTT CTTGTTTTTT TTAATATGAA TAAAAAAAAT 
CATGGTATTA ACCACGTTTG TGTGTAAAAA GAACTTAAAA ACAAATTGAA TTTATTTCAA TGAAGCAAAA GAACAAAAAA AATTATACTT ATTTTTTTTA 

-2381 -2361 -2341 -2321 -2301 

CAACCATAAC TGATAGTAGG TTGGTTATCT TTATCAAAAC AAATAAAGTT AATAGGCAGA AAAATAATTG TCTATAGAAT CAATTATGAA AATGCCATTT 
GTTGGTATTG ACTATCATCC AACCAATAGA AATAGTTTTG TTTATTTCAA TTATCCGTCT TTTTATTAAC AGATATCTTA GTTAATACTT TTACGGTAAA 

-2281 -2261 -2241 -2221 -2201 

TTTGGGATGG CATTTGTGGA TTTTGCCCTT TTTTTAATAG TTTGTGAATT TTGCCATTTT TCAGGTTACG TGAATGAATA TACGTTTTAT TCATTATGTT 
AAACCCTACC GTAAACACCT AAAACGGGAA AAAAATTATC AAACACTTAA AACGGTAAAA AGTCCAATGC ACTTACTTAT ATGCAAAATA AGTAATACAA 



-2181 -2161 -2141 -2121 -2101 



TGGGTTTACT CGGTTGTGGT TGTTCTTAGG GTTTAGTATT TTGTGTAAAC TACGTATTTT TACCAAAAAA AGTCCGAAAT CCATATATTT TTAAATCTTA 
ACCCAAATGA GCCAACACCA ACAAGAATCC CAAATCATAA AACACATTTG ATGCATAAAA ATGGTTTTTT TCAGGCTTTA GGTATATAAA AATTTAGAAT 

-2081 -2061 -2041 -2021 -2001 

GAAAATGGCT TATCCGTAAG ATTTTAGTAA AAATGGCAAT TTCAAAAGAT CTCTATAAAA AATGGCAAAA TCAACAATAA TCCCTTGTCT ATATGGTGGT 
CTTTTACCGA ATAGGCATTC TAAAATCATT TTTACCGTTA AAGTTTTCTA GAGATATTTT TTACCGTTTT AGTTGTTATT AGGGAACAGA TATACCACCA 

-1981 -1961 -1941 -1921 -1901 

ATTTCTGCTA AAAGTGACTT ATGGGTAGAT TTTTTAGCTT CATAGATTCT TTGTCGAAAA AAAATTACTT TGTACATTTT AGTGGAGTTA TTTAAATTTC 
TAAAGACGAT TTTCACTGAA TACCCATCTA AAAAATCGAA GTATCTAAGA AACAGCTTTT TTTTAATGAA ACATGTAAAA TCACCTCAAT AAATTTAAAG 

-1881 -1861 -1841 -1821 -1801 

CCAATTGAAC AAAACCATAT ATTGATGAAA TTCGCAAATG CAATCCAAAA ATAAATATGT TCCACTCTTT TGGTTAGCTT TTAACTAAAG ATGCGTTTTA 
GGTTAACTTG TTTTGGTATA TAACTACTTT AAGCGTTTAC GTTAGGTTTT TATTTATACA AGGTGAGAAA ACCAATCGAA AATTGATTTC TACGCAAAAT 

-1781 -1761 -1741 -1721 -1701 

CTTTATGTAA GTGGTTGATC TTTTGGCAAT GGGGGACAAT GACTATACAA TCTAAGAGAT CATTTTAACG AATATCATTC ATATTTCATC CTCTTCTTCA 
GAAATACATT CACCAACTAG AAAACCGTTA CCCCCTGTTA CTGATATGTT AGATTCTCTA GTAAAATTGC TTATAGTAAG TATAAAGTAG GAGAAGAAGT 

-1681 -1661 -1641 -1621 -1601 

AATTTCAGTT TCACTAATTA ACCACGTTTC AATTGTAGTG TATCGCGAGC TGTAAATATT ATCTAATTTA TGTTACATAA TCATAACTGT AATCTTTATT 
TTAAAGTCAA AGTGATTAAT TGGTGCAAAG TTAACATCAC ATAGCGCTCG ACATTTATAA TAGATTAAAT ACAATGTATT AGTATTGACA TTAGAAATAA 

-1581 -1561 -1541 -1521 -1501 

AGACAAAAAC ATATATACCT CACTGCAAAC ACCTTCAAAC ATGGATAACT TGATTTAGGC ATACAAATAT TATTTCTCAT TTATTTGATA TGACCTATAT 
TCTGTTTTTG TATATATGGA GTGACGTTTG TGGAAGTTTG TACCTATTGA ACTAAATCCG TATGTTTATA ATAAAGAGTA AATAAACTAT ACTGGATATA 

-1481 -1461 -1441 -1421 -1401 

TATGTGGCTA TTTTATCAGT TTTAGTGTTT TTTATGATAA TTGAACCACT TAAATGTTTA TCTCATTTTT CAATTTATTT TAAACTGAAT TAAAAAGTAA 
ATACACCGAT AAAATAGTCA AAATCACAAA AAATACTATT AACTTGGTGA ATTTACAAAT AGAGTAAAAA GTTAAATAAA ATTTGACTTA ATTTTTCATT 

-1381 -1361 -1341 -1321 -1301 

GAAAGTATGA TCCAATAAGG CATCGACACA TGGAAACCCA TTTTAAGGTA GAAGATGCTT TTCTGCGGCT TCTGAAAACA ACTAGAAAAT GATATGATAC 
CTTTCATACT AGGTTATTCC GTAGCTGTGT ACCTTTGGGT AAAATTCCAT CTTCTACGAA AAGACGCCGA AGACTTTTGT TGATCTTTTA CTATACTATG 

-1281 -1261 -1241 -1221 -1201 

GTTGCTTTCA TTTATTGTAA GTATTATTTA GTTTTAATTC ACGCGCTTCA TATCCAGCTG CAAGACTACT ACAACTTGCA ATTATGAGAC TCTCGTTAGA 
CAACGAAAGT AAATAACATT CATAATAAAT CAAAATTAAG TGCGCGAAGT ATAGGTCGAC GTTCTGATGA TGTTGAACGT TAATACTCTG AGAGCAATCT 

-1181 -1161 -1141 -1121 -1101 

AAATTACCAG GTATAATTTA AAAACAAAAA GAACTAGAAT ATATTGGCAA TTATTTGAAG TAAGAAAATA TGAGATTCTT GACCGAGTTG TTAAACTATC 
TTTAATGGTC CATATTAAAT TTTTGTTTTT CTTGATCTTA TATAACCGTT AATAAACTTC ATTCTTTTAT ACTCTAAGAA CTGGCTCAAC AATTTGATAG 

-1081 -1061 -1041 -1021 -1001 

AAACCCAAAA GTTTTGGTTA AAAAATAAGC TAGTACTATG TACATATGTT TTATGTTGAA AATATATTAA ACTGTATGTA AGAGGGAGTG TACTTTCATT 
TTTGGGTTTT CAAAACCAAT TTTTTATTCG ATCATGATAC ATGTATACAA AATACAACTT TTATATAATT TGACATACAT TCTCCCTCAC ATGAAAGTAA 

-981 -961 -941 -921 -901 

TTAGATATAC ATTTCCAGCT AGTACGAGGT CTCTATATAT AAACTTTCTT AATATCGCTA AACAAATTTT ACTTTCAAGT TTGTAATGTG ATAAGTGAAA 
AATCTATATG TAAAGGTCGA TCATGCTCCA GAGATATATA TTTGAAAGAA TTATAGCGAT TTGTTTAAAA TGAAAGTTCA AACATTACAC TATTCACTTT 

-881 -861 -841 -821 -801 

GACCGTATAT ACATACACAT GTTAATCAAC TGATAACCTT TGTGCCTCGT GTGTCTAGTT ACTAGTCAAC CATCAAACGT GCATGATGCT GTTTTTCTTA 
CTGGCATATA TGTATGTGTA CAATTAGTTG ACTATTGGAA ACACGGAGCA CACAGATCAA TGATCAGTTG GTAGTTTGCA CGTACTACGA CAAAAAGAAT 

-781 -761 -741 -721 -701 

GAGTACTATT GTTGTGTTAT ATATAACTAA ACATAAACAA TTTGCTATTA TGATATAAAC ATAGAATTTT CAAGCAATGA TATGTTTAGA TGTTTTGTAT 

CTCATGATAA CAACACAATA TATATTGATT TGTATTTGTT AAACGATAAT ACTATATTTG TATCTTAAAA GTTCGTTACT ATACAAATCT ACAAAACATA 

-681 -661 -641 -621 -601 

AAATATTCCA TAAATAGTAG ACACCCATAT ATACACAAAC ATGAATTCTA CCTGAGGAGA AACACATAGA TGTTCAAATT AAATAATAAC CCTATAATGA 
TTTATAAGGT ATTTATCATC TGTGGGTATA TATGTGTTTG TACTTAAGAT GGACTCCTCT TTGTGTATCT ACAAGTTTAA TTTATTATTG GGATATTACT 

-581 -561 -541 -521 -501 

AAACTCTAAA GTAAGTAATA CGAAATAAAA ATTTATCCTT TAAATAACAT ATAAACATAT ATATACAAGT TTAATTGGTA ATTGTATCAC AAGAGCCAAT 
TTTGAGATTT CATTCATTAT GCTTTATTTT TAAATAGGAA ATTTATTGTA TATTTGTATA TATATGTTCA AATTAACCAT TAACATAGTG TTCTCGGTTA 

-481 -461 -441 -421 -401 

TATTTGGTGA CTGTATCACA CGTGCTTAAA GAGAGCGTGG GAATGAAAGT AAAGAAGAAT AAAGAAGCAG AGAGATGGGC TAGAAATGAG AAAACACACC 
ATAAACCACT GACATAGTGT GCACGAATTT CTCTCGCACC CTTACTTTCA TTTCTTCTTA TTTCTTCGTC TCTCTACCCG ATCTTTACTC TTTTGTGTGG 



-381 -361 -341 -321 -301 

AAACCCTAAC CTCACCCTCA CACATTTCTT ATCTTTTGCT CTCAATAGAT TCCATTGATT CAAAACAAAA TTTTCATTAA GATTTCACAA CCTCCACACA 
TTTGGGATTG GAGTGGGAGT GTGTAAAGAA TAGAAAACGA GAGTTATCTA AGGTAACTAA GTTTTGTTTT AAAAGTAATT CTAAAGTGTT GGAGGTGTGT 

-281 -261 -241 -221 -201 

CTTCCAAACA CAATTAAAGA GAGGAAAAAG AATCAATAAC CCTATAAATA AAAAATCAGA CAAACAGAAG TTTCCTCTTC TTCTTCCTTA AGCTAGTACC 
GAAGGTTTGT GTTAATTTCT CTCCTTTTTC TTAGTTATTG GGATATTTAT TTTTTAGTCT GTTTGTCTTC AAAGGAGAAG AAGAAGGAAT TCGATCATGG 

-181 -161 -141 -121 -101 

TTTTGTTCTT GAAATTAGGG TTAATTTCTT TTTTCCAAAT ACCATCAATT CTCCAGACCA TAAAAACTCA AAAAGATCAG ATCTTTCCTC TGAAAAAGAG 
AAAACAAGAA CTTTAATCCC AATTAAAGAA AAAAGGTTTA TGGTAGTTAA GAGGTCTGGT ATTTTTGAGT TTTTCTAGTC TAGAAAGGAG ACTTTTTCTC 

-81 -61 -41 -21 -1 

ATACCCAACT TATGTTTTTG TGTGTCTGTA TATAGATAAA CATTACATAC CCATATTTGT GTATAGACAT AAAAAGTGGA AATTAAGGTA ACAAAAAGAA 
TATGGGTTGA ATACAAAAAC ACACAGACAT ATATCTATTT GTAATGTATG GGTATAAACA CATATCTGTA TTTTTCACCT TTAATTCCAT TGTTTTTCTT 

20 40 60 80 100 

ATGGGAAGAG GAAGAGTAGA GCTGAAGAGG ATAGAGAACA AAATCAACAG ACAAGTAACG TTTGCAAAGC GTAGGAACGG TTTGTTGAAG AAAGCTTATG 
TACCCTTCTC CTTCTCATCT CGACTTCTCC TATCTCTTGT TTTAGTTGTC TGTTCATTGC AAACGTTTCG CATCCTTGCC AAACAACTTC TTTCGAATAC 

120 140 160 180 200 

AATTGTCTGT TCTCTGTGAT GCTGAAGTTG CTCTCATCAT CTTCTCCAAC CGTGGAAAGC TCTATGAGTT TTGCAGCTCC TCAAAGTAAA CAACTCTCTC 
TTAACAGACA AGAGACACTA CGACTTCAAC GAGAGTAGTA GAAGAGGTTG GCACCTTTCG AGATACTCAA AACGTCGAGG AGTTTCATTT GTTGAGAGAG 

220 240 260 280 300 

ACTCTTTATC AGTTTCTTGA TTGAGTTTTT GCTAGATCTG AGCTTAGATC TTTGTCTCAA GGACTTGTTA TATATAGATC ACACGATCTT GATTTCTACG 
TGAGAAATAG TCAAAGAACT AACTCAAAAA CGATCTAGAC TCGAATCTAG AAACAGAGTT CCTGAACAAT ATATATCTAG TGTGCTAGAA CTAAAGATGC 

320 340 360 380 400 

AAGTTGAGTT AATTAGATTT CTTGATTTCA TTTTCTAGGG TTTTTTTCCA ATTCTTGAAA TTTAAGATCT GGTTTTTTTG TTGTCAATGA TTTAGAACTG 
TTCAACTCAA TTAATCTAAA GAACTAAAGT AAAAGATCCC AAAAAAAGGT TAAGAACTTT AAATTCTAGA CCAAAAAAAC AACAGTTACT AAATCTTGAC 

420 440 460 480 500 

TGAATTTTGT AATCGAATAG ATTCCAAATC CTGATATGCA ATCTGAAAAG TTTTATATAA TTAATATATG TCTGTGTGAT TGGAAACTTA AAAGTTGTTC 
ACTTAAAACA TTAGCTTATC TAAGGTTTAG GACTATACGT TAGACTTTTC AAAATATATT AATTATATAC AGAGACACTA ACCTTTGAAT TTTCAACAAG 

520 540 560 580 600 

ACAGATTTCT ATGAAAATTA CAAGTATCCA ACGTAGAATG ATAATATATG GTTACATGCA TTAACCATTT GTTAGTTCAT CATACTTTAT GGTGGTTAAA 
TGTCTAAAGA TACTTTTAAT GTTCATAGGT TGCATCTTAC TATTATATAC CAATGTACGT AATTGGTAAA CAATCAAGTA GTATGAAATA CCACCAATTT 

620 640 660 680 700 

ACTTCAAACG CGTGTATATC TGTGAAGGCT TTGATTGTTT GTTTTTTCTT AAAAACAATG TTTAATAGAT TTTTAATTAT ATGTTAAAAT AGTTTTGCTT 
TGAAGTTTGC GCACATATAG ACACTTCCGA AACTAACAAA CAAAAAAGAA TTTTTGTTAC AAATTATCTA AAAATTAATA TACAATTTTA TCAAAACGAA 

720 740 760 780 800 

ACATGCATTC AAGAAAATAT AGCGATTAAT TCCTTTTTTC AAATCACAAT TTGTGAATCA AACGAAAACG TAAGATATTG CTTGCAAATG ATAGGATTGA 
TGTACGTAAG TTCTTTTATA TCGCTAATTA AGGAAAAAAG TTTAGTGTTA AACACTTAGT TTGCTTTTGC ATTCTATAAC GAACGTTTAC TATCCTAACT 

820 840 860 880 900 

ACTATTGATA TTTGTAAATA TAAATACGAA ACTTTACGTT TGAAAGTTGA AACAATCAAA TCCAAATCAA CTCGTATATA ATCAGATAAA TAATGGAAAC 
TGATAACTAT AAACATTTAT ATTTATGCTT TGAAATGCAA ACTTTCAACT TTGTTAGTTT AGGTTTAGTT GAGCATATAT TAGTCTATTT ATTACCTTTG 

920 940 960 980 1000 

AATCTTCAAT TTTGATGGAA GAATACTTTA AAACTTGAAG AGCTTTTTTT TTATGGTGAT TTATAGGTTT AGATCTCCAA AGTCAAGTAT GATCTTTTTA 
TTAGAAGTTA AAACTACCTT CTTATGAAAT TTTGAACTTC TCGAAAAAAA AATACCACTA AATATCCAAA TCTAGAGGTT TCAGTTCATA CTAGAAAAAT 

1020 1040 1060 1080 1100 

ATAAACTCTT ATTCTCTCTT TTTGAGTTAT TTTCAGCATG CTCAAGACAC TTGATCGGTA CCAGAAATGC AGCTATGGAT CCATTGAAGT CAACAACAAA 
TATTTGAGAA TAAGAGAGAA AAACTCAATA AAAGTCGTAC GAGTTCTGTG AACTAGCCAT GGTCTTTACG TCGATACCTA GGTAACTTCA GTTGTTGTTT 

1120 1140 1160 1180 1200 

CCTGCCAAAG AACTTGAGGT GTTCTTAATT CAAATACTAT TTTAGATTCC TATCATATCA TTTCAAGAAA GATCTTTTTT AAAAGTTTGT TTTCGTGAAA 
GGACGGTTTC TTGAACTCCA CAAGAATTAA GTTTATGATA AAATCTAAGG ATAGTATAGT AAAGTTCTTT CTAGAAAAAA TTTTCAAACA AAAGCACTTT 



1220 1240 1260 1280 1300 

TATTTCAGAA CAGCTACAGA GAATATCTGA AGCTTAAGGG TAGATATGAG AACCTTCAAC GTCAACAGAG GTACATATCT GTCTACCTCC GTATATTTAC 
ATAAAGTCTT GTCGATGTCT CTTATAGACT TCGAATTCCC ATCTATACTC TTGGAAGTTG CAGTTGTCTC CATGTATAGA CAGATGGAGG CATATAAATG 

1320 1340 1360 1380 1400 

TCAATTCTGT ATCCATGTAG ATTCATATTT GTAGGTGTGT GTGGCTTTTG TTGGTGCAGA AATCTTCTTG GGGAGGATTT AGGACCTTTG AATTCAAAGG 
AGTTAAGACA TAGGTACATC TAAGTATAAA CATCCACACA CACCGAAAAC AACCACGTCT TTAGAAGAAC CCCTCCTAAA TCCTGGAAAC TTAAGTTTCC 

1420 1440 1460 1480 1500 

AGTTAGAGCA GCTTGAGCGT CAACTGGACG GCTCTCTCAA GCAAGTTCGG TCCATCAAGG TATCTTTATA CATGGAATCA ATGATTCAAA TGAGATTAAT 
TCAATCTCGT CGAACTCGCA GTTGACCTGC CGAGAGAGTT CGTTCAAGCC AGGTAGTTCC ATAGAAATAT GTACCTTAGT TACTAAGTTT ACTCTAATTA 

1520 1540 1560 1580 1600 

TTGTGTTGTT TAATTATAAC TACTATGGTG GTATGATGAT TGTTTGCAGA CACAGTACAT GCTTGACCAG CTCTCGGATC TTCAAAATAA AGAGCAAATG 
AACACAACAA ATTAATATTG ATGATACCAC CATACTACTA ACAAACGTCT GTGTCATGTA CGAACTGGTC GAGAGCCTAG AAGTTTTATT TCTCGTTTAC 

1620 1640 1660 1680 1700 

TTGCTTGAAA CCAATAGAGC TTTGGCAATG AAGGTATAAT TACAGAATAA ATGCATTTGG TGCCTTGCGA TCAATCTCTT TCACAGAGTT TAAGTTTCTA 
AACGAACTTT GGTTATCTCG AAACCGTTAC TTCCATATTA ATGTCTTATT TACGTAAACC ACGGAACGCT AGTTAGAGAA AGTGTCTCAA ATTCAAAGAT 

1720 1740 1760 1780 1800 

AACATTTTTG GAAACATCTC TAGTTTTCTT GTTTCTGATT ATAGTCTTTT GGTGAAATGT AAATGTTTAG CTGGATGATA TGATTGGTGT GAGAAGTCAT 
TTGTAAAAAC CTTTGTAGAG ATCAAAAGAA CAAAGACTAA TATCAGAAAA CCACTTTACA TTTACAAATC GACCTACTAT ACTAACCACA CTCTTCAGTA 

1820 1840 1860 1880 1900 

CATATGGGAG GAGGAGGAGG ATGGGAAGGT GGTGAACAGA ATGTTACCTA CGCGCATCAT CAAGCTCAGT CTCAGGGACT ATACCAGCCT CTTGAATGCA 
GTATACCCTC CTCCTCCTCC TACCCTTCCA CCACTTGTCT TACAATGGAT GCGCGTAGTA GTTCGAGTCA GAGTCCCTGA TATGGTCGGA GAACTTACGT 

1920 1940 1960 1980 2000 

ATCCAACTCT GCAAATGGGG TAAATCCTTT GCCTTAAACA ATCATCTGCA AATCAGCTTG TGTACTTCAC TACTAAGATT GTACTTATAT AAGGTTCTTT 
TAGGTTGAGA CGTTTACCCC ATTTAGGAAA CGGAATTTGT TAGTAGACGT TTAGTCGAAC ACATGAAGTG ATGATTCTAA CATGAATATA TTCCAAGAAA 

2020 2040 2060 2080 2100 

AGTTACTTGG TGTAAAGAGG ATCATCAATG TGTGTGAACC TTTTAAGTTG CTGTTTTGGT GATGATGATG ATGATGACAG GTATGATAAT CCGGTATGCT 
TCAATGAACC ACATTTCTCC TAGTAGTTAC ACACACTTGG AAAATTCAAC GACAAAACCA CTACTACTAC TACTACTGTC CATACTATTA GGCCATACGA 

2120 2140 2160 

CAGAGCAAAT AACTGCGACA ACCCAAGCTC AGGCGCAGCA GGGAAACGGT TACATCCCGG GGTGGATGCT C 
GTCTCGTTTA TTGACGCTGT TGGGTTCGAG TCCGCGTCGT CCCTTTGCCA ATGTAGGGCC CCACCTACGA G 

SEQ ID NO: 42 

Arabidopsis SEP2 genomic sequence 

-2981 -2961 -2941 -2921 -2901 

ACGCTCTAAC CAACTGAGCT AATGGGCCAT TTGCGAATGG TAGTGTCTAT TTTACTTATT CGAATCTAAA TCGTCATAGG TAATTAAGAA GACATGCAAA 
TGCGAGATTG GTTGACTCGA TTACCCGGTA AACGCTTACC ATCACAGATA AAATGAATAA GCTTAGATTT AGCAGTATCC ATTAATTCTT CTGTACGTTT 

-2881 -2861 -2841 -2821 -2801 

GCTTAATCAA TGATGGATTC TTTGATTCTA CTTCTAGGTG CCACCATTGA CGCATTCATA AAATCATAAC CGGTCGTTTA CAAAACATAT TGCTTGAATG 
CGAATTAGTT ACTACCTAAG AAACTAAGAT GAAGATCCAC GGTGGTAACT GCGTAAGTAT TTTAGTATTG GCCAGCAAAT GTTTTGTATA ACGAACTTAC 

-2781 -2761 -2741 -2721 -2701 

ATTCTAAACA AATAATAGTT TTTTGTTGAA ATTTTCAAAA CATATGTTAG GTAAGGTCAG GTTTTGCCAA TAAGCCTTAC TATATACAGT GGCAACATGT 
TAAGATTTGT TTATTATCAA AAAACAACTT TAAAAGTTTT GTATACAATC CATTCCAGTC CAAAACGGTT ATTCGGAATG ATATATGTCA CCGTTGTACA 

-2681 -2661 -2641 -2621 -2601 

TTCTTCTACT TTGGAGGATT TTGGGTGAAT ATGAAACCCA TGTGAGCATG ATACATGTGT TTCTTCTTCT ATTGAAATTT CCCCCAATGG TCATTTGCTC 
AAGAAGATGA AACCTCCTAA AACCCACTTA TACTTTGGGT ACACTCGTAC TATGTACACA AAGAAGAAGA TAACTTTAAA GGGGGTTACC AGTAAACGAG 

-2581 -2561 -2541 -2521 -2501 

TTTGCGTTCG TGTTGCGCTT TCCGGTATCA AATCATATAT ATATATAACC TAAATGAGAC TAGACAATTT GAATCATTGT AAAAGGTATA AAGAAGAGAT 
AAACGCAAGC ACAACGCGAA AGGCCATAGT TTAGTATATA TATATATTGG ATTTACTCTG ATCTGTTAAA CTTAGTAACA TTTTCCATAT TTCTTCTCTA 

-2481 -2461 -2441 -2421 -2401 

TATAGTCCAC AATTAACAAA GTAATAAGAC GGTAAAATAT CAAACAAATT GAAAGGGTAA AAAAAAAACA AGAGGGACAA GTCACTGTTA GAAAGGTGAC 
ATATCAGGTG TTAATTGTTT CATTATTCTG CCATTTTATA GTTTGTTTAA CTTTCCCATT TTTTTTTTGT TCTCCCTGTT CAGTGACAAT CTTTCCACTG 

-2381 -2361 -2341 -2321 -2301 

TCCTCCCTTT GGGCCAGCCC CCTACCACAA AAGTCAAAGC TTACTTACTA TTCAGTCATA TATCGACACG TGTACTTCGA ACCACATCAC CCATCCTATT 
AGGAGGGAAA CCCGGTCGGG GGATGGTGTT TTCAGTTTCG AATGAATGAT AAGTCAGTAT ATAGCTGTGC ACATGAAGCT TGGTGTAGTG GGTAGGATAA 

-2281 -2261 -2241 -2221 -2201 

ACGTAATTTC CACTGTCTAG ACTTTTTTTT TTTTTTTTTT TTTTACTTTT TAACGTTTTT TAGCTGTCTC TCTAAATTAC TACATACGGA CTTGCTACGT 
TGCATTAAAG GTGACAGATC TGAAAAAAAA AAAAAAAAAA AAAATGAAAA ATTGCAAAAA ATCGACAGAG AGATTTAATG ATGTATGCCT GAACGATGCA 

-2181 -2161 -2141 -2121 -2101 

CACCTGAGAA GAAAGATCTT TGCTCGTAGA TTCTTTGTCT GAAGGAAAAT TATTTGTATT TAGTTATTTA CAATTGCATA ATTGTGTGTA GTAAATCCGC 
GTGGACTCTT CTTTCTAGAA ACGAGCATCT AAGAAACAGA CTTCCTTTTA ATAAACATAA ATCAATAAAT GTTAACGTAT TAACACACAT CATTTAGGCG 

-2081 -2061 -2041 -2021 -2001 

CAGAATGATA TTAGAGTGAT ACTGAGACGA CGAATGGTGT AACTTGTAAC ATATATACTA ATAAACACGA TTGATTAAAA ATTTACTATA CAGTATATCC 
GTCTTACTAT AATCTCACTA TGACTCTGCT GCTTACCACA TTGAACATTG TATATATGAT TATTTGTGCT AACTAATTTT TAAATGATAT GTCATATAGG 

-1981 -1961 -1941 -1921 -1901 

AAAACATTAT GATTGAGAGT GTACATATAC AATAAGTAAT TAAACCTCAA AACCAAACAG TTTTTTTTTT TTTTGGTCAA CAATAATTAG AAATGAGAAT 
TTTTGTAATA CTAACTCTCA CATGTATATG TTATTCATTA ATTTGGAGTT TTGGTTTGTC AAAAAAAAAA AAAACCAGTT GTTATTAATC TTTACTCTTA 

-1881 -1861 -1841 -1821 -1801 

AAACTATTTA ACTTATAAAT TCTAGACCCA AAAACTCATA TTTTACCCTT CTTGGTCTCA CCTAAAAAGA CTTTAATTCC CAAAACTCTT GCAAACAATG 
TTTGATAAAT TGAATATTTA AGATCTGGGT TTTTGAGTAT AAAATGGGAA GAACCAGAGT GGATTTTTCT GAAATTAAGG GTTTTGAGAA CGTTTGTTAC 

-1781 -1761 -1741 -1721 -1701 

GCCAAACATA GAAGATTGGA AAACAAATTT AAATCTACTT TCACTTTTAT AAAGAATAAT CAACGAACCA ATTAAGTTAA ACCTACATAT ATTCGTATGT 
CCGTTTGTAT CTTCTAACCT TTTGTTTAAA TTTAGATGAA AGTGAAAATA TTTCTTATTA GTTGCTTGGT TAATTCAATT TGCATGTATA TAAGCATACA 

-1681 -1661 -1641 -1621 -1601 

GATCACATAT GTGTTATATT CCTCACGTTC TCTTCCATTT AGCTAATAAC CTTAATTACT TCAAGAAATC ATATATCAAC CGAAAACTAG TAAAATAAAT 

CTAGTGTATA CACAATATAA GGAGTGCAAG AGAAGGTAAA TCGATTATTG GAATTAATGA AGTTCTTTAG TATATAGTTG GCTTTTGATC ATTTTATTTA 

-1581 -1561 -1541 -1521 -1501 

ATACATACTG AAAGCGCGCA AAATTTTTAG CAATATTTTA AAATACCCTA CATCATAGTC TTAACTAATT AATCTTTCTG ATCAAAATTT ATTTTCATAA 
TATGTATGAC TTTCGCGCGT TTTAAAAATC GTTATAAAAT TTTATGGGAT GTAGTATCAG AATTGATTAA TTAGAAAGAC TAGTTTTAAA TAAAAGTATT 

-1481 -1461 -1441 -1421 -1401 

TATTCATAAA TACTTATGGA TTACCTAAAC CAGGATACTT ATCCCTATAA ATCTGTCAAT CATCATGGAT TCATGGAGAC ATGGTCAGAT ATCCCACGTC 
ATAAGTATTT ATGAATACCT AATGGATTTG GTCCTATGAA TAGGGATATT TAGACAGTTA GTAGTACCTA AGTACCTCTG TACCAGTCTA TAGGGTGCAG 

-1381 -1361 -1341 -1321 -1301 

CAGATACAAT GTAACATATT GATATACTGC GGCTGATTAT TATTTTTTAC ATTAGAACGA GTTTAGATCC AAAACAAAAT TGGTATTCTC AAACAAAAAT 
GTCTATGTTA CATTGTATAA CTATATGACG CCGACTAATA ATAAAAAATG TAATCTTGCT CAAATCTAGG TTTTGTTTTA ACCATAAGAG TTTGTTTTTA 

-1281 -1261 -1241 -1221 -1201 

TAAAAATTGA ATACGAAAGT AATAGAACAA AACTTCAATG TTGTCGAATA GATAGGAAGC AATAGAAAAG CGACACGTAC ATGTCCATTT TAAGGTAGGA 
ATTTTTAACT TATGCTTTCA TTATCTTGTT TTGAAGTTAC AACAGCTTAT CTATCCTTCG TTATCTTTTC GCTGTGCATG TACAGGTAAA ATTCCATCCT 

-1181 -1161 -1141 -1121 -1101 

GAGGCTTTTC TGCGGCTTGT GAAGTAAGAA AAAGAAAATG ATGATAGCTG CTTTCGTTTC ATTCATTGCA GAAGAAACCA ATGTTTCCCC AATCTCACGC 
CTCCGAAAAG ACGCCGAACA CTTCATTCTT TTTCTTTTAC TACTATCGAC GAAAGCAAAG TAAGTAACGT CTTCTTTGGT TACAAAGGGG TTAGAGTGCG

-1081 -1061 -1041 -1021 -1001

GCCTCCTCCT ATCTACCACC ACTTGGACAA ATCCCCTTTT CAGTATTAGT TTTTTTTTCC GGACATTGTA CATTCAAAAG CATTCCAAGT GTCTAATAAA 
CGGAGGAGGA TAGATGGTGG TGAACCTGTT TAGGGGAAAA GTCATAATCA AAAAAAAAGG CCTGTAACAT GTAAGTTTTC GTAAGGTTCA CAGATTATTT 

-981 -961 -941 -921 -901 

CATAACTAAC CACTCCAAGA TGCAAAATCT AGCTACGAAC AAATTTTAAA CTATAGAGAT GAACTTTAAA TTCGGGCATT AATTAGTGGA ACTTGAGCTA 
GTATTGATTG GTGAGGTTCT ACGTTTTAGA TCGATGCTTG TTTAAAATTT GATATCTCTA CTTGAAATTT AAGCCCGTAA TTAATCACCT TGAACTCGAT 

-881 -861 -841 -821 -801 

TTGATGAGTT TTCTGACTTT TTGAAGCTTA ATTGAGTTTT ATATACACTA TATATAGGCT TGTAATAATA TGGATCAAAC AAGAAATATA TAAACTACAA 
AACTACTCAA AAGACTGAAA AACTTCGAAT TAACTCAAAA TATATGTGAT ATATATCCGA ACATTATTAT ACCTAGTTTG TTCTTTATAT ATTTGATGTT 

-781 -761 -741 -721 -701 

ATTGGGAATT AGGTTTTAAA ACGTTATCGT TCTATTTTAA TTCAGGCACC TTTAGAATAT CAAGATCCAT GCATGTTTCA ATATTTCTGT TGACAAATAA 
TAACCCTTAA TCCAAAATTT TGCAATAGCA AGATAAAATT AAGTCCGTGG AAATCTTATA GTTCTAGGTA CGTACAAAGT TATAAAGACA ACTGTTTATT 

-681 -661 -641 -621 -601

ATAAAGATGT CTCAAATATG AAGTTTGGGC AACGTACGTG TAGACCTAAA AGAGTCGAAA CATTGGTATC TAAGTCATAT ATCTAGATGT ATATGGACAT 
TATTTCTACA GAGTTTATAC TTCAAACCCG TTGCATGCAC ATCTGGATTT TCTCAGCTTT GTAACCATAG ATTCAGTATA TAGATCTACA TATACCTGTA

-581 -561 -541 -521 -501 

GGATTATATA ACTAGACAAC GTTTGTTTTA AAAACTTAAT TCATTTTTCT TAATTAGTAG CAACTAGCAA CTAACTACTC ATGGCAAATA ATGGTGTCTG 
CCTAATATAT TGATCTGTTG CAAACAAAAT TTTTGAATTA AGTAAAAAGA ATTAATCATC GTTGATCGTT GATTGATGAG TACCGTTTAT TACCACAGAC 

-481 -461 -441 -421 -401

CGTGGCACGC ACTTGGGAGA GAAGGTGTGA GAATGTTTTT TACTTTCTGT GTAAAAGATG GAAGAGAGAG AAAGAGTAAA GAAGTAGAGA GAGAGATATT 
GCACCGTGCG TGAACCCTCT CTTCCACACT CTTACAAAAA ATGAAAGACA CATTTTCTAC CTTCTCTCTC TTTCTCATTT CTTCATCTCT CTCTCTATAA 

-381 -361 -341 -321 -301 

GTATCACCAA ACCCTAATGA TCTCTCACCC TCACAAATTT TCTTATCTTT ATAGCTTTTA TAGATTCACA AAAACTTTTC TTCAGATTCA CAATCTCATC 
CATAGTGCTT TGGGATTACT AGAGAGTGGG AGTCTTTAAA AGAATAGAAA TATCGAAAAT ATCTAAGTGT TTTTGAAAAG AAGTCTAAGT GTTAGAGTAG 

-281 -261 -241 -221 -201 

ACAACCCTTC AAAAAGAGAA AAGATCTAAA GAATAAACAA GAGCCCTAAT ATCAAATCAC AACCAAAAAA ACCAAAGAAA GCTAATTAAA GTTTTCTCTC 
TGTTGGGAAG TTTTTCTCTT TTCTAGATTT CTTATTTGTT CTCGGGATTA TAGTTTAGTG TTGGTTTTTT TGGTTTCTTT CGATTAATTT CAAAAGAGAG 

-181 -161 -141 -121 -101 

TAGCTATTCC TCTTCTTTTC TTGTTCTTGA AAACTAGGGT TTACTTCACC AAAAGATAAG ATCTTTCCCC AGAAAAAGCA ATACCCAAGT CATGTTTCTG 
ATCGATAAGG AGAAGAAAAG AACAAGAACT TTTGATCCCA AATGAAGTGG TTTTCTATTC TAGAAAGGGG TCTTTTTCGT TATGGGTTCA GTACAAAGAC

-81 -61 -41 -21 -1 

TGTGTCTGTA TATAGATAAA ACATTACATA CCCTAATAAG GTTACACAAA TAGCTATAAA AGAGGGAAAA TAAGATAGGG ATTTTTTGGG GTGAGGAAAG 
ACACAGACAT ATATCTATTT TGTAATGTAT GGGATTATTC CAATGTGTTT ATCGATATTT TCTCCCTTTT ATTCTATCCC TAAAAAACCC CACTCCTTTC 

20 40 60 80 100 

ATGGGAAGAG GAAGAGTAGA GCTCAAGAGG ATAGAGAACA AAATCAACAG ACAAGTGACG TTTGCTAAAC GTAGAAATGG TTTGCTGAAA AAAGCTTATG 
TACCCTTCTC CTTCTCATCT CGAGTTCTCC TATCTCTTGT TTTAGTTGTC TGTTCACTGC AAACGATTTG CATCTTTACC AAACGACTTT TTTCGAATAC 

120 140 160 180 200 

AGCTTTCTGT TCTCTGCGAT GCTGAAGTCT CTCTCATCGT CTTCTCCAAC CGTGGCAAGC TCTACGAGTT CTGCAGCACC TCCAAGTACT TCTCTTTCTT 
TCGAAAGACA AGAGACGCTA CGACTTCAGA GAGAGTAGCA GAAGAGGTTG GCACCGTTCG AGATGCTCAA GACGTCGTGG AGGTTCATGA AGAGAAAGAA 

220 240 260 280 300 

TATACACTTA TTAGATCTGT GTGTAGATCT TTCATTTTTC TAGTCTTGTG ATGAGTTTTA TCTTTCTTGA TTGCTTTTTA ACAAAATACT TGATATATTT 
ATATGTGAAT AATCTAGACA CACATCTAGA AAGTAAAAAG ATCAGAACAC TACTCAAAAT AGAAAGAACT AACGAAAAAT TGTTTTATGA ACTATATAAA 

320 340 360 380 400 

TCAGTTTCTT AATCTGATCT CTAATTAGGT TTTGATTATA GAAGAATAAT TCAGTACTTT CAAGTGATTG AATTTCGAGA TCTGATCTTA ATTTAATCAT 
AGTCAAAGAA TTAGACTAGA GATTAATCCA AAACTAATAT CTTCTTATTA AGTCATGAAA GTTCACTAAC TTAAAGCTCT AGACTAGAAT TAAATTAGTA 

420 440 460 480 500

CATGTCAAAT TCTTAGGGAT TTAATTGCAA TCTATTTTTA GATTTATCGG AGCTAGGAAA GTATCATAAT GATATACTAT TATTATCATG TAATTTCATT 
GTACAGTTTA AGAATCCCTA AATTAACGTT AGATAAAAAT CTAAATAGCC TCGATCCTTT CATAGTATTA CTATATGATA ATAATAGTAC ATTAAAGTAA 

520 540 560 580 600 

GTCTCTACAC GGATATATAT GTGATTAGAA CTTGGTAAAG TAAACTAAAG ATTCACAGTC TTCAATGAAA TTTAAAAGAT CCAACGTAGA ATAATTAGTG 
CAGAGATGTG CCTATATATA CACTAATCTT GAACCATTTC ATTTGATTTC TAAGTGTCAG AAGTTACTTT AAATTTTCTA GGTTGCATCT TATTAATCAC

620 640 660 680 700 

GTTCCATGCA TTAACCAGTC TAATTAAAGC TCATGCAGAC ATTTAAGCAC CACATGAATT TAATATCTTT TTAATTAAGG GATCTTCTTT TTATAAATTT 
CAAGGTACGT AATTGGTCAG ATTAATTTCG AGTACGTCTG TAAATTCGTG GTGTACTTAA ATTATAGAAA AATTAATTCC CTAGAAGAAA AATATTTAAA 

720 740 760 780 800 

TCTTTTGTTA GTTTTTAAAA TTTTAGTTTG TTCATTAAAT TTATAGATTC TTCTTCTCCT GATTTGTGTT TTTTGATCTT TCAGCATGCT CAAGACACTG 
AGAAAACAAT CAAAAATTTT AAAATCAAAC AAGTAATTTA AATATCTAAG AAGAAGAGGA CTAAACACAA AAAACTAGAA AGTCGTACGA GTTCTGTGAC 

820 840 860 880 900

GAAAGGTATC AGAAGTGTAG CTATGGCTCC ATTGAAGTCA ACAACAAACC TGCTAAAGAG CTTGAGGTTT AATCTCCAAC ATCTCTTCGA TCTTAATTAT
CTTTCCATAG TCTTCACATC GATACCGAGG TAACTTCAGT TGTTGTTTGG ACGATTTCTC GAACTCCAAA TTAGAGGTTG TAGAGAAGCT AGAATTAATA 

920 940 960 980 1000 

TTATCCTTTT TTAATTTTAT CTAAAGAAAA TGTTTGATTT TGAGACAAAA GCCCTTCAAA GTTTCTTACA TAGATATTCA ATTGTCTATT ATCTTCGCAA 
AATAGGAAAA AATTAAAATA GATTTCTTTT ACAAACTAAA ACTCTGTTTT CGGGAAGTTT CAAAGAATGT ATCTATAAGT TAACAGATAA TAGAAGCGTT 

1020 1040 1060 1080 1100 

TTTTCAGAAC AGCTACAGAG AGTACTTGAA GCTGAAAGGT AGATATGAAA ATCTGCAACG TCAGCAGAGG TATATACATT AATGTGGATG ATGATCATTT 
AAAAGTCTTG TCGATGTCTC TCATGAACTT CGACTTTCCA TCTATACTTT TAGACGTTGC AGTCGTCTCC ATATATGTAA TTACACCTAC TACTAGTAAA



1120 1140 1160 1180 1200

ATAAACAGCA TATATATATA TATATATATA TATATATATA GTTTGTATTG ATCATGAAAG TGTGTTGCTG CAGAAATCTT CTTGGAGAGG ATCTTGGACC 
TATTTGTCGT ATATATATAT ATATATATAT ATATATATAT CAAACATAAC TAGTACTTTC ACACAACGAC GTCTTTAGAA GAACCTCTCC TAGAACCTGG 



1220 1240 1260 1280 1300 

TCTGAATTCA AAGGAGCTAG AGCAGCTTGA GCGTCAACTA GACGGCTCTC TGAAGCAAGT TCGCTGCATC AAGGTGATTT ACTTCTGTAC ATACACTGAA 
AGACTTAAGT TTCCTCGATC TCGTCGAACT CGCAGTTGAT CTGCCGAGAG ACTTCGTTCA AGCGACGTAG TTCCACTAAA TGAAGACATG TATGTGACTT 



1320 1340 1360 1380 1400 

AGATTCACAC AAATCTTTCT CTATATATAG ACTGAGACAC ATGCATGAAA TGTTTTTGAT GCGTGAGGTT ATCTGAAAAT GCCTCTTCTT TTTTGCAGAC
TCTAAGTGTG TTTAGAAAGA GATATATATC TGACTCTGTG TACGTACTTT ACAAAAACTA CGCACTCCAA TAGACTTTTA CGGAGAAGAA AAAACGTCTG 



1420 1440 1460 1480 1500 

ACAGTATATG CTTGACCAGC TCTCTGATCT TCAAGGTAAG GAGCATATCT TGCTTGATGC CAACAGAGCT TTGTCAATGA AGGTATATGA TGATGTTTCT 
TGTCATATAC GAACTGGTCG AGAGACTAGA AGTTCCATTC CTCGTATAGA ACGAACTACG GTTGTCTCGA AACAGTTACT TCCATATACT ACTACAAAGA 



1520 1540 1560 1580 1600

CTCTCTCTCC TCCAGTTTCT ATTTATAGAT GGAAACTTTA AATAGTCCAA TTTATATATA TGAGTCTAAA TTTCACATTC TTCAACTGCT ACATGTTTCT 
GAGAGAGAGG AGGTCAAAGA TAAATATCTA CCTTTGAAAT TTATCAGGTT AAATATATAT ACTCAGATTT AAAGTGTAAG AAGTTGACGA TGTACAAAGA 



1620 1640 1660 1680 1700

TTTGTATTAT TTCTATGATA TCTTCAGGAA AGTTTGAAAA ATATTGTGTT TTGTTTAGCT GGAAGATATG ATCGGCGTGA GACATCACCA TATAGGAGGA 

AAACATAATA AAGATACTAT AGAAGTCCTT TCAAACTTTT TATAACACAA AACAAATCGA CCTTCTATAC TAGCCGCACT CTGTAGTGGT ATATCCTCCT 

1720 1740 1760 1780 1800 

GGATGGGAAG GTGGTGATCA ACAGAATATT GCCTATGGAC ATCCTCAGGC TCATTCTCAG GGACTATACC AATCTCTTGA ATGTGATCCC ACTTTGCAAA 

CCTACCCTTC CACCACTAGT TGTCTTATAA CGGATACCTG TAGGAGTCCG AGTAAGAGTC CCTGATATGG TTAGAGAACT TACACTAGGG TGAAACGTTT 



1820 1840 1860 1880 1900 

TTGGGTAAAT CAAACAACTT TTCTTGCCTT AAGACATCAA CTTAGGTTAT AAACAGTTAG CAGTTTGCTT TAAGCCCAAC ATTGTCTTTG TTTCATAGAG 
AACCCATTTA GTTTGTTGAA AAGAACGGAA TTCTGTAGTT GAATCCAATA TTTGTCAATC GTCAAACGAA ATTCGGGTTG TAACAGAAAC AAAGTATCTC 



1920 1940 1960 1980 2000 

GCTTTGGTTA AAACTCGTGT TGTTTAGTCT AAGGATTCAG CACTTTGATG TCTGAAGTAT GGAAAATCAA TATCTCAGAC TTGAAAATGT GGGTTTCTAT 
CGAAACCAAT TTTGAGCACA ACAAATCAGA TTCCTAAGTC GTGAAACTAC AGACTTCATA CCTTTTAGTT ATAGAGTCTG AACTTTTACA CCCAAAGATA



2020 2040 2060 2080 2100 

TGTTGACTTC GAAACTATGT TGTTGTGGTG TTGCAAACAG ATATAGCCAT CCAGTGTGCT CAGAGCAAAT GGCTGTGACG GTGCAAGGTC AGTCCCAACA 
ACAACTGAAG CTTTGATACA ACAACACCAC AACGTTTGTC TATATCGGTA GGTCACACGA GTCTCGTTTA CCGACACTGC CACGTTCCAG TCAGGGTTGT 



2120 

AGGAAACGGC TACATCCCTG GCTGGATGCT G 
TCCTTTGCCG ATGTAGGGAC CGACCTACGA C 



SEQ ID NO: 43 

Arabidopsis SEP3 genomic sequence 

-2981 -2961 -2941 -2921 -2901

GTCCCCTTCC CATTACGTCT TGACGTGGAC CCTGTCCGTC TATTTTTAGC AGATTAATCC AACGGTTCTT ATTCTTTCTT CGACCCTTCA CGACATTGCC
CAGGGGAAGG GTAATGCAGA ACTGCACCTG GGACAGGCAG ATAAAAATCG TCTAATTAGG TTGCCAAGAA TAAGAAAGAA GCTGGGAAGT GCTGTAACGG 

-2881 -2861 -2841 -2821 -2801

TCAAAGCCGT CCGATTCTCA TCTCACGCCC AATGGACCAC ATATATCACC AGTACTCCGC AACTTAGCTG TCGTGTAGGA TTTCACGTGG CATTTATTTG 
AGTTTCGGCA GGCTAAGAGT AGAGTGCGGG TTACCTGGTG TATATAGTGG TCATGAGGCG TTGAATCGAC AGCACATCCT AAAGTGCACC GTAAATAAAC 

-2781 -2761 -2741 -2721 -2701

TTCTAGTTTG TAGTGCAAAC ATTGCAAGTT GATATGGTCC CCTATCGATC ACCGTCGTCT CTTTAGCTTC ACATCGAGAT TCTTCTTTCT TTCCTACGTG 
AAGATCAAAC ATCACGTTTG TAACGTTCAA CTATACCAGG GGATAGCTAG TGGCAGCAGA GAAATCGAAG TGTAGCTCTA AGAAGAAAGA AAGGATGCAC 

-2681 -2661 -2641 -2621 -2601 

TAATAGCATT TTTGATTTTG AGAATTTCTT TAGAACCGTT GGATCTCTCA TCGTTGGTTG ATCCATCCAT CCAAATGGGA CCTGTGTGTG CTCCATCCAG 
ATTATCGTAA AAACTAAAAC TCTTAAAGAA ATCTTGGCAA CCTAGAGAGT AGCAACCAAC TAGGTAGGTA GGTTTACCCT GGACACACAC GAGGTAGGTC 

-2581 -2561 -2541 -2521 -2501 

GGCATATGAT CCCAAAGCCA AAAGAGTATT TCCAAGTGCT TTCTTTCTTT CTTTCTTTCT TTCTTACTAA CCTTTTTTTT TCTTATGCTT TAGACTAAGA 
CCGTATACTA GGGTTTCGGT TTTCTCATAA AGGTTCACGA AAGAAAGAAA GAAAGAAAGA AAGAATGATT GGAAAAAAAA AGAATACGAA ATCTGATTCT 
-2481 -2461 -2441 -2421 -2401

AATTTATTCG GCCATATCCA CTTTTACGAA TATACTTCTT ACAAGATCTA GATTTTTTTG AGTTAATTCG GTGTATATAA CATTGGCATG GACTGCAATT 
TTAAATAAGC CGGTATAGGT GAAAATGCTT ATATGAAGAA TGTTCTAGAT CTAAAAAAAC TCAATTAAGC CACATATATT GTAACCGTAC CTGACGTTAA 



-2381 -2361 -2341 -2321 -2301 

AAGTAATGGT AATGTGATCA TGATGCGATG TGTCGTTATC AGTAGTATAA TATTGATGGG CTACCCTGGA AAACAAAATT ACGTGTTATA TGTACACAAT 
TTCATTACCA TTACACTAGT ACTACGCTAC ACAGCAATAG TCATCATATT ATAACTACCC GATGGGACCT TTTGTTTTAA TGCACAATAT ACATGTGTTA 



-2281 -2261 -2241 -2221 -2201 

TTGGTAGAAC CGTAGAAATT AAACTGAATA AAACCTTCTA TAATGTTCAA AATTATATGG TACAGATTAA TACGGAAAAA CATTCACGCT TTACGTAACA 
AACCATCTTG GCATCTTTAA TTTGACTTAT TTTGGAAGAT ATTACAAGTT TTAATATACC ATGTCTAATT ATGCCTTTTT GTAAGTGCGA AATGCATTGT 

-2181 -2161 -2141 -2121 -2101 

ATTAAGTGGA AAGTAAAATT ATCCCAAAAA TATTTATATC ACATCATTGT TATATTTCTA AGTTTTTTTA TATCTCTAAT GGTATATGTT TTACAGATTG 
TAATTCACCT TTCATTTTAA TAGGGTTTTT ATAAATATAG TGTAGTAACA ATATAAAGAT TCAAAAAAAT ATAGAGATTA CCATATACAA AATGTCTAAC 



-2081 -2061 -2041 -2021 -2001 

TTTTTTGGGA AAATTCTTAA AGAGACTTGA AGAATGTTTT TTTTTTATTT TCTTGAAATG TTTGACACTT GAAACCGTTT AAAAACTCAA ATATAGTATA 
AAAAAACCCT TTTAAGAATT TCTCTGAACT TCTTACAAAA AAAAAATAAA AGAACTTTAC AAACTGTGAA CTTTGGCAAA TTTTTGAGTT TATATCATAT 



-1981 -1961 -1941 -1921 -1901 

TATCATTGTT GGTCTCATAC CTTGTAATTC ACCACATATA TTATCAATGG GGAAGATTTG AAAATTTTTG GGGGATCACA AAACGAAGGA AAGAGTACAA 
ATAGTAACAA CCAGAGTATG GAACATTAAG TGGTGTATAT AATAGTTACC CCTTCTAAAC TTTTAAAAAC CCCCTAGTGT TTTGCTTCCT TTCTCATGTT 



-1881 -1861 -1841 -1821 -1801 

AAAGAGAAGG AAAAGATAGA AGATATATGT TTTTAACTTC ATTGGTATGA CATCAATAAA TAAATAGTTG AATGTACTTT AGTTTCTCTT TTGGTTTAAT 
TTTCTCTTCC TTTTCTATCT TCTATATACA AAAATTGAAG TAACCATACT GTAGTTATTT ATTTATCAAC TTACATGAAA TCAAAGAGAA AACCAAATTA 



-1781 -1761 -1741 -1721 -1701 

GCACATCATC TCGATCAATT GTCATCATCT TACATTGAAT TATACGACCA GATCTGATAA CAAGTGAATT CGTACTTGCC CTTCCCTTTC TTCTCATACG
CGTGTAGTAG AGCTAGTTAA CAGTAGTAGA ATGTAACTTA ATATGCTGGT CTAGACTATT GTTCACTTAA GCATGAACGG GAAGGGAAAG AAGAGTATGC



-1681 -1661 -1641 -1621 -1601 

TCCTTCTAAC TAATTTTGAT TGTAACTTAT AATTATATAA CCATATTTAA TTTTATTTTA TCTAAAACCA ATTGAAGCAA ATTAAAATAT CATAAATCTT 
AGGAAGATTG ATTAAAACTA ACATTGAATA TTAATATATT GGTATAAATT AAAATAAAAT AGATTTTGGT TAACTTCGTT TAATTTTATA GTATTTAGAA 



-1581 -1561 -1541 -1521 -1501

GAGTCCCACA TGAAGACAAT ATATAAAACT CGTGCAAATT TGCTTAAAAT GCTTCTATGA GACCATGACC AAGTGAGATT AATAAGCGAT TCAATGTGCA
CTCAGGGTGT ACTTCTGTTA TATATTTTGA GCACGTTTAA ACGAATTTTA CGAAGATACT CTGGTACTGG TTCACTCTAA TTATTCGCTA AGTTACACGT 



-1481 -1461 -1441 -1421 -1401 

AATCAAAAGA GAAAAGAAGC TAATGGGTTT AAATATAACC AAACAGAATA ATAATGCTAT GTTTAGTTTT TCTAATTGAA TCATACCTTT GTGTCCATCA
TTAGTTTTCT CTTTTCTTCG ATTACCCAAA TTTATATTGG TTTGTCTTAT TATTACGATA CAAATCAAAA AGATTAACTT AGTATGGAAA CACAGGTAGT

-1381 -1361 -1341 -1321 -1301 

CCTACTTACC GGTCAGAATA AAGCAATTAC GTCTGCAACC AAAAAGCACT AAGACTTTCG GTCAGACATG ATCTCTAACA TCGGACGAAC CCTAAGATAA
GGATGAATGG CCAGTCTTAT TTCGTTAATG CAGACGTTGG TTTTTCGTGA TTCTGAAAGC CAGTCTGTAC TAGAGATTGT AGCCTGCTTG GGATTCTATT



-1281 -1261 -1241 -1221 -1201 

CCAAAATAAA CTATATCTTA TATTCAAATC TCTGTTTATT TTATCCATTT ATGTTTTCTT TCTTTCCCAT AATTTTTTTT GTGTCTCATC AGACTCTCTT 
GGTTTTATTT GATATAGAAT ATAAGTTTAG AGACAAATAA AATAGGTAAA TACAAAAGAA AGAAAGGGTA TTAAAAAAAA CACAGAGTAG TCTGAGAGAA 

-1181 -1161 -1141 -1121 -1101 

ACCAAACTGA ATTTATCAAC ATGGTTTTTT TTTTGGCCAC ATCAAAATGG TGGTTTATAA AGTAGACTAA TACAAAAGAC ATTTCTGTTA ATTTCACTAA 
TGGTTTGACT TAAATAGTTG TACCAAAAAA AAAACCGGTG TAGTTTTACC ACCAAATATT TCATCTGATT ATGTTTTCTG TAAAGACAAT TAAAGTGATT 



-1081 -1061 -1041 -1021 -1001 

CAAAAATAAT CTTAGCAGTA CTATAGATTG GAAAAGGAAA AGCAAATCTA GCAGTAAGAT TTATCAAAAC TAGCAGTAAG AGTTTTAGAT ATCATGAAAA 
GTTTTTATTA GAATCGTCAT GATATCTAAC CTTTTCCTTT TCGTTTAGAT CGTCATTCTA AATAGTTTTG ATCGTCATTC TCAAAATCTA TAGTACTTTT 



-981 -961 -941 -921 -901 

CATCACAAAC GAGTAGTGTT TTACTTTACA TTTTTAACCA ATCACAAGGG TAGTTCCGTA AGTTGGGAAA ATCGTACGAG GCTTCACCTA GTTAAGGTTA 
GTAGTGTTTG CTCATCACAA AATGAAATGT AAAAATTGGT TAGTGTTCCC ATCAAGGCAT TCAACCCTTT TAGCATGCTC CGAAGTGGAT CAATTCCAAT



-881 -861 -841 -821 -801

GGTCACATGA TTCCCTGAAC TCGATTTTAT AAGTAAAAAA GAAAAATTTA TAAAATCAAA ATTTTTTATA TAAAAAAATC AGGTGGATTT ATCAGACCCT 
CCAGTGTACT AAGGGACTTG AGCTAAAATA TTCATTTTTT CTTTTTAAAT ATTTTAGTTT TAAAAAATAT ATTTTTTTAG TCCACCTAAA TAGTCTGGGA 
-781 -761 -741 -721 -701

ACCATCGAGA TGTCGACACG TGTCCAAACT CATTCATTGC CCTACTATTT TCTGTTTAGG GTTGCAATCA CTCATCGCAC ACGCGCCATC TCCACCTTCC 
TGGTAGCTCT ACAGCTGTGC ACAGGTTTGA GTAAGTAACG GGATGATAAA AGACAAATCC CAACGTTAGT GAGTAGCGTG TGCGCGGTAG AGGTGGAAGG 

-681 -661 -641 -621 -601

ATTATTAATC TCTCATTTTC AACATCACAC TCTTACGAAT CATACGATTT TAATATCTCT GTCTCTCTCA ACGTATTAAA TAAAAATGGT TTTAAATGTT 
TAATAATTAG AGAGTAAAAG TTGTAGTGTG AGAATGCTTA GTATGCTAAA ATTATAGAGA CAGAGAGAGT TGCATAATTT ATTTTTACCA AAATTTACAA 

-581 -561 -541 -521 -501

AGGGTTTTTT GTAGGATTTT CAATTATTAA TCTCTATAAT TCGATGAACT AAGTAAAAAA GCATCAAACT TTCTTGGCAG ATCACATTTT TCTCTAAACT 
TCCCAAAAAA CATCCTAAAA GTTAATAATT AGAGATATTA AGCTACTTGA TTCATTTTTT CGTAGTTTGA AAGAACCGTC TAGTGTAAAA AGAGATTTGA 

-481 -461 -441 -421 -401 

AAATATGGAC TGAAATTGAA AAATTAAACC ACTAGCTAGA ATAAAGTGTT GGTGAGAGTG GAACTCTAAT TTCTCTCCTT TACTAATTAT GTATAAACAC 
TTTATACCTG ACTTTAACTT TTTAATTTGG TGATCGATCT TATTTCACAA CCACTCTCAC CTTGAGATTA AAGAGAGGAA ATGATTAATA CATATTTGTG 

-381 -361 -341 -321 -301 

AAAAATGCAC CAAATTTTTA GGTTTGAAAA TATCTAAGCA TGGATAGGGT AATTAACATT TTTTCTTTCA ATTTTGCAAT ATTTGAATAA ATCCTATGAG
TTTTTACGTG GTTTAAAAAT CCAAACTTTT ATAGATTCGT ACCTATCCCA TTAATTGTAA AAAAGAAAGT TAAAACGTTA TAAACTTATT TAGGATACTC 

-281 -261 -241 -221 -201 

GGTCTTTGGT ACACAATAAT TGGAGGGTAT ATAGTTGAGT CTGAGAGTAT ATTAGAAAGA GAATATTTCA AGTAATGAAG CTGACATGTT TATATGTACT 
CCAGAAACCA TGTGTTATTA ACCTCCCATA TATCAACTCA GACTCTCATA TAATCTTTCT CTTATAAAGT TCATTACTTC GACTGTACAA ATATACATGA 

-181 -161 -141 -121 -101 

TTGAGAGAAG TGTTGTGAGA TTTGTACAAA TGTATATGTA CACTTTAAAA AGCAATATAA GATAGATAAA AAAAATATAA AGAAAAAAAG AAAGAAAGAA 
AACTCTCTTC ACAACACTCT AAACATGTTT ACATATACAT GTGAAATTTT TCGTTATATT CTATCTATTT TTTTTATATT TCTTTTTTTC TTTCTTTCTT 

-81 -61 -41 -21 -1

AGAAAGAAAG AGAGAGGCTC ATATATATAT AGAATTGCTT GCAAGGAAAG AGAGAGAGAG AGATTGAGAT ATCTTTTGGG AGAGGAGAAA GAAAAAGAAA 
TCTTTCTTTC TCTCTCCGAG TATATATATA TCTTAACGAA CGTTCCTTTC TCTCTCTCTC TCTAACTCTA TAGAAAACCC TCTCCTCTTT CTTTTTCTTT

20 40 60 80 100 

ATGGGAAGAG GGAGAGTAGA ATTGAAGAGG ATAGAGAACA AGATCAATAG GCAAGTGACG TTTGCAAAGA GAAGGAATGG TCTTTTGAAG AAAGCATACG 
TACCCTTCTC CCTCTCATCT TAACTTCTCC TATCTCTTGT TCTAGTTATC CGTTCACTGC AAACGTTTCT CTTCCTTACC AGAAAACTTC TTTCGTATGC 

120 140 160 180 200 

AGCTTTCAGT TCTATGTGAT GCAGAAGTTG CTCTCATCAT CTTCTCAAAT AGAGGAAAGC TGTACGAGTT TTGCAGTAGT TCGAGGTATA TATCTACTTT 
TCGAAAGTCA AGATACACTA CGTCTTCAAC GAGAGTAGTA GAAGAGTTTA TCTCCTTTCG ACATGCTCAA AACGTCATCA AGCTCCATAT ATAGATGAAA 

220 240 260 280 300 

TGTATATATA TTACTTATAA CATAAAGATT TTATATACAT ATTAAGTAAC ACAAAAATGT CTTGTATGTA TGGGTCTCTC TGTGATGTGT TGTTGTGTCG 
ACATATATAT AATGAATATT GTATTTGTAA AATATATGTA TAATTCATTG TGTTTTTACA GAACATACAT ACCCAGAGAG ACACTACACA ACAACACAGC 

320 340 360 380 400 

TACGTACGTG TTCTATCATA TCCTTTTAAA AGAAGCAAAG AGGAAAAAAA ATTTGGGATA CCCCAAATCT GTATCATTTT ATAACAAGTT TGCTTTTTTG 
ATGCATGCAC AAGATAGTAT AGGAAAATTT TCTTCGTTTC TCCTTTTTTT TAAACCCTAT GGGGTTTAGA CATAGTAAAA TATTGTTCAA ACGAAAAAAC

420 440 460 480 500 

ATGTTCTTTT GTGTTTCTCT TTGATTTCCA TTTTTGTTTT TGATTTTTTT TCTATTTCTC TTTACATCTA TCAAAGTTTT TTTTCTTATA TTTTATTGCT 
TACAAGAAAA CACAAAGAGA AACTAAAGGT AAAAACAAAA ACTAAAAAAA AGATAAAGAG AAATGTAGAT AGTTTCAAAA AAAAGAATAT AAAATAACGA 

520 540 560 580 600 

TATTTGTTTG TCTACTTAAT TCACATTATC TGAGAGAAGA ACAATCTATC TGATATGAAA TTAGGGTTAA TTTCTCTTGT GAGTACTCTT TAATTCACAT 
ATAAACAAAC AGATGAATTA AGTGTAATAG ACTCTCTTCT TGTTAGATAG ACTATACTTT AATCCCAATT AAAGAGAACA CTCATGAGAA ATTAAGTGTA 

620 640 660 680 700

AAGCTTAAAG TTTCCACCTT TTGATTCTGG GGGTCGTCCA ATTCGATCAA ATCACTCAAT TTTGTTGTCA GATTGATATA AGTTCATAGG GGGATATTGT 
TTCGAATTTC AAAGGTGGAA AACTAAGACC CCCAGCAGGT TAAGCTAGTT TAGTGAGTTA AAACAACAGT CTAACTATAT TCAAGTATCC CCCTATAACA 

720 740 760 780 800 

TTCCACGACA ATCCATTTTA GTAACCCTTA GGGGTTTCCA ATTTTGGGTT TTGAATTGAC GCTAATGTCA AATTCATCTA AAGTCCGTTG GATATGTATA
AAGGTGCTGT TAGGTAAAAT CATTGGGAAT CCCCAAAGGT TAAAACCCAA AACTTAACTG CGATTACAGT TTAAGTAGAT TTCAGGCAAC CTATACATAT 

820 840 860 880 900

CTTGGGGATG GGATTCATCC TTTTTTCTGG GTTCTTTAGA TCTTCTCTTA AAAGACTAAC AGATTTTGTT GTAAACCCTA GGAAACAGTT AAAAATCCCA
GAACCCCTAC CCTAAGTAGG AAAAAAGACC CAAGAAATCT AGAAGAGAAT TTTCTGATTG TCTAAAACAA CATTTGGGAT CCTTTGTCAA TTTTTAGGGT 
920 940 960 980 1000

TTTTTAAAAA CATGTTTTGA ACTTGATGAG TAAGATTAAT GGAAGAAATG ATGTTTTTGT GTGGTGTGAA GCATGCTTCG GACACTGGAG AGGTACCAAA 
AAAAATTTTT GTACAAAACT TGAACTACTC ATTCTAATTA CCTTCTTTAC TACAAAAACA CACCACACTT CGTACGAAGC CTGTGACCTC TCCATGGTTT 

1020 1040 1060 1080 1100

AGTGTAACTA TGGAGCACCA GAACCCAATG TGCCTTCAAG AGAGGCCTTA GCAGTTGTAC CCAATTCTCT TCTCTTTCTT CTAATTACCT TAATTAATTA 
TCACATTGAT ACCTCGTGCT CTTGGGTTAC ACGGAAGTTC TCTCCGGAAT CGTCAACATG GGTTAAGAGA AGAGAAAGAA GATTAATGGA ATTAATTAAT 

1120 1140 1160 1180 1200

CTCTCAATTT TTACTTTGAT TTTTAGAGTC AAATGATTAA TGTTATAATT TGTCATATAC TTCAGGAACT TAGTAGCCAG CAGGAGTATC TCAAGCTTAA 
GAGAGTTAAA AATGAAACTA AAAATCTCAG TTTACTAATT ACAATATTAA ACAGTATATG AAGTCCTTGA ATCATCGGTC GTCCTCATAG AGTTCGAATT 

1220 1240 1260 1280 1300 

GGAGCGTTAT GACGCCTTAC AGAGAACCCA AAGGTAAACT AATTAGCTTC TTCAGCTACC TTCAGAGAGT GTTTGTTTTT TTAGTAGATT TTTTTGATGG 
CCTCGCAATA CTGCGGAATG TCTCTTGGGT TTCCATTTGA TTAATCGAAG AAGTCGATGG AAGTCTCTCA CAAACAAAAA AATCATCTAA AAAAACTACC 

1320 1340 1360 1380 1400

TTTTGATGTT GAAATAGGAA TCTGTTGGGA GAAGATCTTG GACCTCTAAG TACAAAGGAG CTTGAGTCAC TTGAGAGACA GCTTGATTCT TCCTTGAAGC 
AAAACTACAA CTTTATCCTT AGACAACCCT CTTCTAGAAC CTGGAGATTC ATGTTTCCTC GAACTCAGTG AACTCTCTGT CGAACTAAGA AGGAACTTCG 

1420 1440 1460 1480 1500 

AGATCAGAGC TCTCAGGGTA CTACTTTGTT CATCAATATC TTTATACACT GATCTATTTC CATAGTAAGA TTAAATTTGG TGTTTAATTC TGCAGACACA 
TCTAGTCTCG AGAGTCCCAT GATGAAACAA GTAGTTATAG AAATATGTGA CTAGATAAAG GTATCATTCT AATTTAAACC ACAAATTAAG ACGTCTGTGT 

1520 1540 1560 1580 1600

GTTTATGCTT GACCAGCTCA ACGATCTTCA GAGTAAGGTA AATAAAGAAA CACTCATTCT CCTCTCTAAA TTCCTCATCT AAAAGTAATG TAACCAAGAA

CAAATACGAA CTGGTCGAGT TGCTAGAAGT CTCATTCCAT TTATTTCTTT GTGAGTAAGA GGAGAGATTT AAGGAGTAGA TTTTCATTAC ATTGGTTCTT

1620 1640 1660 1680 1700 

AACACAAATA TTTGGAGCAG GAACGCATGC TGACTGAGAC AAATAAAACT CTAAGACTAA GGGTAATTAA TATACATTCT CATATCACCA AATTAATGCA

TTGTGTTTAT AAACCTCGTC CTTGCGTACG ACTGACTCTG TTTATTTTGA GATTCTGATT CCCATTAATT ATATGTAAGA GTATAGTGGT TTAATTACGT

1720 1740 1760 1780 1800

TCACTAAATT TGGTTATAAT GTGTGTGTGT ATATACATAT GTGACAGTTA GCTGATGGGT ATCAGATGCC ACTCCAGCTG AACCCTAACC AAGAAGAGGT 
AGTGATTTAA ACCAATATTA CACACACACA TATATGTATA CACTGTCAAT CGACTACCCA TAGTCTACGG TGAGGTCGAC TTGGGATTGG TTCTTCTCCA 

1820 1840 1860 1880 1900

TGATCACTAC GGTCGTCATC ATCATCAACA ACAACAACAC TCCCAAGCTT TCTTCCAGCC TTTGGAATGT GAACCCATTC TTCAGATCGG GTAACTTTAG 
ACTAGTGATG CCAGCAGTAG TAGTAGTTGT TGTTGTTGTG AGGGTTCGAA AGAAGGTCGG AAACCTTACA CTTGGGTAAG AAGTCTAGCC CATTGAAATC 

1920 1940 1960 1980 2000 

ACTAGTATAA CCAATTTGAT TTGAGTTCTA TTATAAGCTT TTCTTAAGAA AGTATCTCAA ACTACTAAAT TTTATGGAGC AGGTATCAGG GGCAACAAGA 
TGATCATATT GGTTAAACTA AACTCAAGAT AATATTCGAA AAGAATTCTT TCATAGAGTT TGATGATTTA AAATACCTCG TCCATAGTCC CCGTTGTTCT 

2020 2040 2060 

TGGAATGGGA GCAGGACCAA GTGTGAATAA TTACATGTTG GGTTGGTTAC CTTATGACAC CAACTCTATT 
ACCTTACCCT CGTCCTGGTT CACACTTATT AATGTACAAC CCAACCAATG GAATACTGTG GTTGAGATAA 



SEQ ID NO: 44

Arabidopsis AGL20 genomic sequence



-2981 -2961 -2941 -2921 -2901 

GAAAAAAAAA ACACCTAAAG AAGTGAATAT AATAGGCATA TACATATGAG GAAAATGAAA ACAAAAGGAG CGAAAAATAG ATTTAACCTA AAAGAGGAAG 
CTTTTTTTTT TGTGGATTTC TTCACTTATA TTATCCGTAT ATGTATACTC CTTTTACTTT TGTTTTCCTC GCTTTTTATC TAAATTGGAT TTTCTCCTTC

-2881 -2861 -2841 -2821 -2801

TAAAGAGGTT ATAAGAGGTA AGAAAAGTAG GACCATATAA TAGCTATATT GTAGAATTTT ATTATTTGGA GATATGGCAA TTTTTGTGAG GGTCCCATGA 
ATTTCTCCAA TATTCTCCAT TCTTTTCATC CTGGTATATT ATCGATATAA CATCTTAAAA TAATAAACCT CTATACCGTT AAAAACACTC CCAGGGTACT

-2781 -2761 -2741 -2721 -2701 

AGACTAAAGT GTGGAGCACG ATTTATCTTT GTAATTAATA AAATAATAAA TATATTATTA TTGTCTCGGG ATTTTTCGAT TGATGAGAAA AAGTAAGAGG 
TCTGATTTCA CACCTCGTGC TAAATAGAAA CATTAATTAT TTTATTATTT ATATAATAAT AACAGAGCCC TAAAAAGCTA ACTACTCTTT TTCATTCTCC

-2681 -2661 -2641 -2621 -2601 

TGCGTTTTCG AATTATCATT GGCTAACGTT TGTACGTGAC TGTACGGACG ACGTTGATGT ATTTCTAATA TTGTACTCTT TTTTCCCACC CTTATTTCTC 
ACGCAAAAGC TTAATAGTAA CCGATTGCAA ACATGCACTG ACATGCCTGC TGCAACTACA TAAAGATTAT AACATGAGAA AAAAGGGTGG GAATAAAGAG 

-2581 -2561 -2541 -2521 -2501 

TAATTCTTGT ACATTAACCC CAAACTAATT TTACAAACAC ATTGGTGTTT AATCATTGTG AAATTTTGAT TTATCTAAAA TACACTTTAT ATGTTATGAT 
ATTAAGAACA TGTAATTGGG GTTTGATTAA AATGTTTGTG TAACCACAAA TTAGTAACAC TTTAAAACTA AATAGATTTT ATGTGAAATA TACAATACTA 

-2481 -2461 -2441 -2421 -2401 

TTTGCATGAG CTTATGACTG GTAAACTCAT GAGATTTCCA TATCACCATG TTGGAAGTTA CTAACCATAC ATCTTTTAAA TGCAAATTCA CATCATTCCT 
AAACGTACTC GAATACTGAC CATTTGAGTA CTCTAAAGGT ATAGTGGTAC AACCTTCAAT GATTGGTATG TAGAAAATTT ACGTTTAAGT GTAGTAAGGA 
-2381 -2361 -2341 -2321 -2301

AGACTGCTAG ACAGACATGT ACTTACTTAT ACAAGGTTTT TCTAATCTAA TGGCAACAAA GAAACTTGTG ACTAAACGCA TACGTATCTC TCATATAGTG 
TCTGACGATC TGTCTGTACA TGAATGAATA TGTTCCAAAA AGATTAGATT ACCGTTGTTT CTTTGAACAC TGATTTGCGT ATGCATAGAG AGTATATCAC 

-2281 -2261 -2241 -2221 -2201 

TAGACTAGAA CTCTACGTAT CTCTCATATA GTCATATTTT TAAAAAAATT ATACTTTGGG ATCTCGAAGC GAAAATTAGA TTAGTTTATA TGATTATGTA
ATCTGATCTT GAGATGCATA GAGAGTATAT CAGTATAAAA ATTTTTTTAA TATGAAACCC TAGAGCTTCG CTTTTAATCT AATCAAATAT ACTAATACAT 

-2181 -2161 -2141 -2121 -2101

CAAAAAAAAT CGGATATTAC TACCACTTAA AAATAATTGT AGTGGTCAAT CATATCTAAA ATTAATCGCA GTGAACAAAA ACCTGAAGCA TAGCCTGGTT 
GTTTTTTTTA GCCTATAATG ATGGTGAATT TTTATTAACA TCACCAGTTA GTATAGATTT TAATTAGCGT CACTTGTTTT TGGACTTCGT ATCGGACCAA

-2081 -2061 -2041 -2021 -2001 

CTATCTTACT TTCGATGTGA CACATTACTA ACACGATTGT TTTAATCTAT AGGACGAATC CTTTAAGTAA TGTATAGTTG GTTCAGTTAC GTTAGATACT 
GATAGAATGA AAGCTACACT GTGTAATGAT TGTGCTAACA AAATTAGATA TCCTGCTTAG GAAATTCATT ACATATCAAC CAAGTCAATG CAATCTATGA 

-1981 -1961 -1941 -1921 -1901 

TTTTGTTTTG GATTTGTCTC AACCAGTTAA GAAGTGATCG TATTTACTAG TGGTATACGA TGATGTTTCT TTAAATCTGA ATTGGGTCTA CAAAATACAT
AAAACAAAAC CTAAACAGAG TTGGTCAATT CTTCACTAGC ATAAATGATC ACCATATGCT ACTACAAAGA AATTTAGACT TAACCCAGAT GTTTTATGTA

-1881 -1861 -1841 -1821 -1801 

AACTAAACTT CAAACCGGGT TTATACTTTA TACAAACACG AAAATATAAA GATAGAGACA ATTCACCAGA GAAGATGTGT ATTTATATAA AAATTATCCA 
TTGATTTGAA GTTTGGCCCA AATATGAAAT ATGTTTGTGC TTTTATATTT CTATCTCTGT TAAGTGGTCT CTTCTACACA TAAATATATT TTTAATAGGT 

-1781 -1761 -1741 -1721 -1701

TACAGATTTT CGGACCTATC TGTTTGATAT TTAATATATA TAAATACGTT AACATATTTC ACCAGAGAAG ATGTGTATTT TTCGAAATAA TTAGTTTGTG
ATGTCTAAAA GCCTGGATAG ACAAACTATA AATTATATAT ATTTATGCAA TTGTATAAAG TGGTCTCTTC TACACATAAA AAGCTTTATT AATCAAACAC 

-1681 -1661 -1641 -1621 -1601

TGGTCCTCCT CCCGATATAG ATAAAAGATC ATTAGATATC GATTAACAAT TTTATCTCCA AAAAAGGATA TTTTTTTGGT GCCACTAGCT AGACAAGACG
ACCAGGAGGA GGGCTATATC TATTTTCTAG TAATCTATAG CTAATTGTTA AAATAGAGGT TTTTTCCTAT AAAAAAACCA CGGTGATCGA TCTGTTCTGC 

-1581 -1561 -1541 -1521 -1501 

TTCGATAAGC TGAATTATTA TTGGATTTCT AAGTTACGTT TTCTTTAGTA ATCCGAGGGA CCAAAAATAG CAAATGCCTC TTTAGACACG TCGCTACTTA 
AAGCTATTCG ACTTAATAAT AACCTAAAGA TTCAATGCAA AAGAAATCAT TAGGCTCCCT GGTTTTTATC GTTTACGGAG AAATCTGTGC AGCGATGAAT 

-1481 -1461 -1441 -1421 -1401

ACGCCATTGC CCCATTGTCT CTGTACTAGC CTCCAAATAT TTGGATTAAT GGTCACTTAG GTAATGAGGA AATTGTAGTA TTTTGTAATG TGGTTTTGTC 
TGCGGTAACG GGGTAACAGA GACATGATCG GAGGTTTATA AACCTAATTA CCAGTGAATC CATTACTCCT TTAACATCAT AAAACATTAC ACCAAAACAG 

-1381 -1361 -1341 -1321 -1301 

CAACTTATAA AAACTTACAA TTGCAAGTAA TTAATTATTC ACATGGAGAT GTAAGATTAT GTCATATAAC TAAAAACACA ATTTAAGAAC AACAATAAGA 
GTTGAATATT TTTGAATGTT AACGTTCATT AATTAATAAG TGTACCTCTA CATTCTAATA CAGTATATTG ATTTTTGTGT TAAATTCTTG TTGTTATTCT

-1281 -1261 -1241 -1221 -1201 

AACAATGGAC AAACAAGCAT AGAAAATATA CAAATCAAAT GAATTTTATC TGTTGGGATG GAAAGATATT ATAAAAATTG ATTAAAACCA ATATAGTTGT
TTGTTACCTG TTTGTTCGTA TCTTTTATAT GTTTAGTTTA CTTAAAATAG ACAACCCTAC CTTTCTATAA TATTTTTAAC TAATTTTGGT TATATCAACA 

-1181 -1161 -1141 -1121 -1101 

ATTACTCACA GGTAAGAAAA AACGATATTC TTATTTTTCA TATCAATTAC AAGTGGGGGC ATATAGGTAC GAGAGAGTGT TTGTGTCCAC ATTAAAAACA 
TAATGAGTGT CCATTCTTTT TTGCTATAAG AATAAAAAGT ATAGTTAATG TTCACCCCCG TATATCCATG CTCTCTCACA AACACAGGTG TAATTTTTGT 

-1081 -1061 -1041 -1021 -1001 

AAAAAAGATT TTTGTTAGAA GAAATTTAAT AAAAATAATT TGACAGGCAT TTCCATCCAA CTAGATATTT ATGGGAGGGA AAAAGATGTG TATGTAAAAA
TTTTTTCTAA AAACAATCTT CTTTAAATTA TTTTTATTAA ACTGTCCGTA AAGGTAGGTT GATCTATAAA TACCCTCCCT TTTTCTACAC ATACATTTTT 

-981 -961 -941 -921 -901 

TGTCCATATG TATCAAAATA TGCTATTTTT GGTCTTTCTT AAGGCTTTTT TCCAAAATAA GTAAAGGATG AGGTTTCAAG CGTCCATCAT ATTTGCGACA 
ACAGGTATAC ATAGTTTTAT ACGATAAAAA CCAGAAAGAA TTCCGAAAAA AGGTTTTATT CATTTCCTAC TCCAAAGTTC GCAGGTAGTA TAAACGCTGT

-881 -861 -841 -821 -801 

CATATGACTG ACTATTTAGC TCCTCCCTCT TTCTTTCTCT TATTTTATTA TCTTTCTCCA AGAAATAAAA TAGAAAAGAA AATATATATG GTTTCACAAA 
GTATACTGAC TGATAAATCG AGGAGGGAGA AAGAAAGAGA ATAAAATAAT AGAAAGAGGT TCTTTATTTT ATCTTTTCTT TTATATATAC CAAAGTGTTT 

-781 -761 -741 -721 -701 

CACCATTACC ATAACTACAA CGAGAAGAGG ATCTTTTTTA AGGAGAAAAG CAGAGAGAGA AGAGACGAGT GTGTGAAGTT TTTTTGTCTT TTGTTTCTTT 
GTGGTAATGG TATTGATGTT GCTCTTCTCC TAGAAAAAAT TCCTCTTTTC GTCTCTCTCT TCTCTGCTCA CACACTTCAA AAAAACAGAA AACAAAGAAA 

-681 -661 -641 -621 -601 

TATTACACAC AAATAGATGA AACGAGGAAA GCTACTTCTT TTGCTACTTC CATAAAAAGG TTCTTCCTTT CGCAGAGAAT CAACTTTGAT CATCTTCTTC 
ATAATGTGTG TTTATCTACT TTGCTCCTTT CGATGAAGAA AACGATGAAG GTATTTTTCC AAGAAGGAAA GCGTCTCTTA GTTGAAACTA GTAGAAGAAG 

-581 -561 -541 -521 -501 

CTTCTCTTTC TTTCTTCTTC TCCCTCCAGT AATGCTTATA TAGTCTCCTC CTATATCTCT ACCTATACAT ACACAAACCC TTTATCCTCG AAAGCTTCCT
GAAGAGAAAG AAAGAAGAAG AGGGAGGTCA TTACGAATAT ATCAGAGGAG GATATAGAGA TGGATATGTA TGTGTTTGGG AAATAGGAGC TTTCGAAGGA 

-481 -461 -441 -421 -401 

CCTGGTTAGG TTTTTATCAA ACCCTTTTAG CCAATCGGTA AGATCTCTTC GTCATGATCT TTTCTTTTTT CTTTTGCTTT GTACTCTGAT GGATCTATAA
GGACCAATCC AAAAATAGTT TGGGAAAATC GGTTAGCCAT TCTAGAGAAG CAGTACTAGA AAAGAAAAAA GAAAACGAAA CATGAGACTA CCTAGATATT 

-381 -361 -341 -321 -301

ACTTATATGG GTTTGGTTTC ATTTGGTTCG ATTTGATGTG TTTGGTTTCT TTGTCCTAAA TCTCATGAAA GGAGGTTGCA TCCTTCAATT AAACCGATAA 
TGAATATACC CAAACCAAAG TAAACCAAGC TAAACTACAC AAACCAAAGA AACAGGATTT AGAGTACTTT CCTCCAACGT AGGAAGTTAA TTTGGCTATT 

-281 -261 -241 -221 -201 

CAAAAGTTTC CATTACAGAC TTATAGATCA GATACTTTAG ATTGTTTTGC TTTTTGGGTA CTTAATCTTT CGTTGACTTC ATCAGTCTTC TCCCACCCAA 
GTTTTCAAAG GTAATGTCTG AATATCTAGT CTATGAAATC TAACAAAACG AAAAACCCAT GAATTAGAAA GCAACTGAAG TAGTCAGAAG AGGGTGGGTT 

-181 -161 -141 -121 -101 

ACAAAAAAGT CATATTTCGA TCATATCTTC ATTTTTTTAA CCTACTCTCT TTGATTCATA TATGAAATGG GTTGTTTTAT GTGTGTGACT AATCTTGTTA 
TGTTTTTTCA GTATAAAGCT AGTATAGAAG TAAAAAAATT GGATGAGAGA AACTAAGTAT ATACTTTACC CAACAAAATA CACACACTGA TTAGAACAAT 

-81 -61 -41 -21 -1 

TTGAGGTGGT TGCACCATTG ATCTACCGTT TTCTTCAATT TTTGAAAAAA TAATTTTATT TTTTTTCTGT GTGCAAGGGA AATTAACTAA AGAAGAAGAT 
AACTCCACCA ACGTGGTAAC TAGATGGCAA AAGAAGTTAA AAACTTTTTT ATTAAAATAA AAAAAAGACA CACGTTCCCT TTAATTGATT TCTTCTTCTA 

20 40 60 80 100 

ATGGTGAGGG GCAAAACTCA GATGAAGAGA ATAGAGAATG CAACAAGCAG ACAAGTGACT TTCTCCAAAA GAAGGAATGG TTTGTTGAAG AAAGCCTTTG 
TACCACTCCC CGTTTTGAGT CTACTTCTCT TATCTCTTAC GTTGTTCGTC TGTTCACTGA AAGAGGTTTT CTTCCTTACC AAACAACTTC TTTCGGAAAC 

120 140 160 180 200 

AGCTCTCAGT GCTTTGTGAT GCTGAAGTTT CTCTTATCAT CTTCTCTCCT AAAGGCAAAC TTTATGAATT CGCCAGCTCC AAGTACGTTC TTTTTGTCTT 
TCGAGAGTCA CGAAACACTA CGACTTCAAA GAGAATAGTA GAAGAGAGGA TTTCCGTTTG AAATACTTAA GCGGTCGAGG TTCATGCAAG AAAAACAGAA



220 240 260 280 300 

TCTTACAAAT CATCCATAGA AAGAGAGAGA GAGAGAGATC TCATTAACCT CTCTATTTGT ATCTTAATTT TTTTTGGTTT ATATATGGAT TTGATTGGCC 
AGAATGTTTA GTAGGTATCT TTCTCTCTCT CTCTCTCTAG AGTAATTGGA GAGATAAACA TAGAATTAAA AAAAACCAAA TATATACCTA AACTAACCGG 

320 340 360 380 400

TTTTGTGGAA TCACATCTCT TTGACGTTTG CTTTGAGAGG TGTGTTTAAA TGAGTTTCTT GGTTTCTGCA AAATTAGGGC TATTATTAAA GTAGTATCAA 
AAAACACCTT AGTGTAGAGA AACTGCAAAC GAAACTCTCC ACACAAATTT ACTCAAAGAA CCAAAGACGT TTTAATCCCG ATAATAATTT CATCATAGTT 

420 440 460 480 500 

GTACATATAC CCTCTTATTT ATTGTTTTTT TATTTCCGCT AGTATATCAT CTTGTTTAAT CATCTGTCTC TCTCTTTCTC AATTAGTTTC TCAAGTTATG 
CATGTATATG GGAGAATAAA TAACAAAAAA ATAAAGGCGA TCATATAGTA GAACAAATTA GTAGACAGAG AGAGAAAGAG TTAATCAAAG AGTTCAATAC 

520 540 560 580 600 

ATATAAATAA AATGTGCTCT TTCGTAGCCA ATTTACACTT GTTATATATT TGATCTTCTT AGAGATCATG ATCACATAGT ATTAATAAAA CAACTTTCAA 
TATATTTATT TTACACGAGA AAGCATCGGT TAAATGTGAA CAATATATAA ACTAGAAGAA TCTCTAGTAC TAGTGTATCA TAATTATTTT GTTGAAAGTT 

620 640 660 680 700 

TTAGTATTCT TTTGGTTTGA ACTAATCTTT GTCTTGTTAT TGCTTTAAGC AAAACATGTT GTTCTAATTT CTAAGTGATG ATTAGGAAGT TGTTTCATCA 
AATCATAAGA AAACCAAACT TGATTAGAAA CAGAACAATA ACGAAATTCG TTTTGTACAA CAAGATTAAA GATTCACTAC TAATCCTTCA ACAAAGTAGT 

720 740 760 780 800

TTCCTGATTT ATTAATCCCT CATGCTTCAT TTCATGCTCA TTCCTAATTT AGTTCAATTT GTTTGAATAT TTGTTCCTGA TTTTGACATA GAAACTCAAA 
AAGGACTAAA TAATTAGGGA GTACGAAGTA AAGTACGAGT AAGGATTAAA TCAAGTTAAA CAAACTTATA AACAAGGACT AAAACTGTAT CTTTGAGTTT 

820 840 860 880 900 

GCTAGCTAGC CAAACCTAAA TGTTGATTGT TTTTGAGAAT CAAAAGAGTT TTATCTTGTA CTGTTAGGTA GTAGGGAAAC CAAACTTACT TTTGATGAAT
CGATCGATCG GTTTGGATTT ACAACTAACA AAAACTCTTA GTTTTCTCAA AATAGAACAT GACAATCCAT CATCCCTTTG GTTTGAATGA AAACTACTTA 

920 940 960 980 1000 

CATTACTTCT GTAAATGAAA ATGCCAGCTT TTGATCAGAT GTTTCAGACA TTTGGTCCAT TTGGGAAAGT ACTTCTTTCT CTCGAACCTA CTAAATATAA 
GTAATGAAGA CATTTACTTT TACGGTCGAA AACTAGTCTA CAAAGTCTGT AAACCAGGTA AACCCTTTCA TGAAGAAAGA GAGCTTGGAT GATTTATATT 

1020 1040 1060 1080 1100

AGATAAGACC TCACATGTTT TTGATTTTCT AAAATAGGGG GAAAAAGTAC AAGACTTTTC AAGCTATGTC CTTGATTAAG TCTAGTGATA TCTTCAATAA 
TCTATTCTGG AGTGTACAAA AACTAAAAGA TTTTATCCCC CTTTTTCATG TTCTGAAAAG TTCGATACAG GAACTAATTC AGATCACTAT AGAAGTTATT 

1120 1140 1160 1180 1200

GAAATGTTTT GAGAACACCA TTGGGATCTA AATTTGATCT CTGATGATTT ACTTTAATGT TCCAATTATA TATGTTTTTG ACAGTATGCA AGATACCATA 
CTTTACAAAA CTCTTGTGGT AACCCTAGAT TTAAACTAGA GACTACTAAA TGAAATTACA AGGTTAATAT ATACAAAAAC TGTCATACGT TCTATGGTAT

1220 1240 1260 1280 1300 

GATCGTTATC TGAGGCATAC TAAGGATCGA GTCAGCACCA AACCGGTTTC TGAAGAAAAT ATGCAGGTTT ATTCTTTATG ATCTTCTTGC CTATATATCA 
CTAGCAATAG ACTCCGTATG ATTCCTAGCT CAGTCGTGGT TTGGCCAAAG ACTTCTTTTA TACGTCCAAA TAAGAAATAC TAGAAGAACG GATATATAGT 

1320 1340 1360 1380 1400 

ATTCTTGCTA ATTAATACTT TTACTATATA ATATCAAAGA GCGGTAATGA ATATAACCAC AATATGTATA TAATCTCAAG GTCACAGGAT CAAGTCACAT 
TAAGAACGAT TAATTATGAA AATGATATAT TATAGTTTCT CGCCATTACT TATATTGGTG TTATACATAT ATTAGAGTTC CAGTGTCCTA GTTCAGTGTA 

1420 1440 1460 1480 1500

ATTTATAATT AGGATATATA TGTACATGCA ATAACATTTC TGTGATATAA CCAACAGCAT TTGAAATATG AAGCAGCAAA CATGATGAAG AAAATTGAAC
TAAATATTAA TCCTATATAT ACATGTACGT TATTGTAAAG ACACTATATT GGTTGTCGTA AACTTTATAC TTCGTCGTTT GTACTACTTC TTTTAACTTG 

1520 1540 1560 1580 1600 

AACTCGAAGC TTCTAAACGG TTTGTGATAT ATACATATAT ACAAACACAT TATTCATCAC TTGTATATAT CTATTTCATG ATGCATAGGA GAGTTTGATC 
TTGAGCTTCG AAGATTTGCC AAACACTATA TATGTATATA TGTTTGTGTA ATAAGTAGTG AACATATATA GATAAAGTAC TACGTATCCT CTCAAACTAG 

1620 1640 1660 1680 1700 

AATTAGTGTT TTGTTTTTGT AATCAGTAAA CTCTTGGGAG AAGGCATAGG AACATGCTCA ATCGAGGAGC TGCAACAGAT TGAGCAACAG CTTGAGAAAA
TTAATCACAA AACAAAAACA TTAGTCATTT GAGAACCCTC TTCCGTATCC TTGTACGAGT TAGCTCCTCG ACGTTGTCTA ACTCGTTGTC GAACTCTTTT 

1720 1740 1760 1780 1800 

GTGTCAAATG TATTCGAGCA AGAAAGGTAT GTGTATATAT TTATCTGTTA TATCTCCACA TTATAAGTAT TGTTCGAATC ATCTTCTGAA ACCACTCATA 
CACAGTTTAC ATAAGCTCGT TCTTTCCATA CACATATATA AATAGACAAT ATAGAGGTGT AATATTCATA ACAAGCTTAG TAGAAGACTT TGGTGAGTAT 

1820 1840 1860 1880 1900

ATTATAACTC AATTTCTCAT CTCTTTTAGA CTCAAGTGTT TAAGGAACAA ATTGAGCAGC TCAAGCAAAA GGTAAAGTAG TTTTTATGAG TGTATATAAA 
TAATATTGAG TTAAAGAGTA GAGAAAATCT GAGTTCACAA ATTCCTTGTT TAACTCGTCG AGTTCGTTTT CCATTTCATC AAAAATACTC ACATATATTT

1920 1940 1960 1980 2000

CAGATATAAG TATGTATGCA AATTGTGTAA TATTCCAAGT AAGTAAGCCT CTTGTGCTTG CTTTTTACAA ATTGGAATCT AAAACTTTTG CAGGAGAAAG
GTCTATATTC ATACATACGT TTAACACATT ATAAGGTTCA TTCATTCGGA GAACACGAAC GAAAAATGTT TAACCTTAGA TTTTGAAAAC GTCCTCTTTC 

2020 2040 2060 2080 2100 

CTCTAGCTGC AGAAAACGAG AAGCTCTCTG AAAAGGTATA ATATATTCTT ATGGGTCTCA AGTTAGGGTT GCACATTCGT TTTTTTATTC GGTAAAGATA 
GAGATCGACG TCTTTTGCTC TTCGAGAGAC TTTTCCATAT TATATAAGAA TACCCAGAGT TCAATCCCAA CGTGTAAGCA AAAAAATAAG CCATTTCTAT 

2120 2140 2160 2180 2200 

AGAAAGTTGG GGTTCTTTTT GGGGGTTATT AGGTTAGGAG AGTCCTTACT AGTTTTTCTT GGTTATCTTC AATCATCAAC CTTCTTTAAT TTATGTATTG 
TCTTTCAACC CCAAGAAAAA CCCCCAATAA TCCAATCCTC TCAGGAATGA TCAAAAAGAA CCAATAGAAG TTAGTAGTTG GAAGAAATTA AATACATAAC 

2220 2240 2260 2280 2300 

TTCTATATAT CTTCTAATTT GCATCTATTA ATTTTGTGTA ATAATTCTAT TTGAATGCAG TGGGGATCTC ATGAAAGCGA AGTTTGGTCA AATAAGAATC 
AAGATATATA CAAGATTAAA CGTAGATAAT TAAAACACAT TATTAAGATA AACTTACGTC ACCCCTAGAG TACTTTCGCT TCAAACCAGT TTATTCTTAG

2320 2340 2360 2380 

AAGAAAGTAC TGGAAGAGGT GATGAAGAGA GTAGCCCAAG TTCTGAAGTA GAGACGCAAT TGTTCATTGG GTTACCTTGT TCTTCAAGAA AG
TTCTTTCATG ACCTTCTCCA CTACTTCTCT CATCGGGTTC AAGACTTCAT CTCTGCGTTA ACAAGTAACC CAATGGAACA AGAAGTTCTT TC 



SEQ ID NO: 45 

Arabidopsis AGL22 genomic sequence 

-2981 -2961 -2941 -2921 -2901

TACAAGTCAT CGCCGCCGTC GTCATTTTCA GGATCCGGCG AGAAACTGAA CCAAAATAAT ACTTATTTTA CTCGTAAGGA AAATTTGGGC CTAATAAAAG 
ATGTTCAGTA GCGGCGGCAG CAGTAAAAGT CCTAGGCCGC TCTTTGACTT GGTTTTATTA TGAATAAAAT GAGCATTCCT TTTAAACCCG GATTATTTTC 

-2881 -2861 -2841 -2821 -2801

CCCAATAATA ATAAAAAGCC CATTAGGGAC TCCGCTTTAT GATAACGGTG ACTGTAGTTT CCTTGATGTG TCAGAGAGAG TGTGTAGTGT AGGGACTGTG 
GGGTTATTAT TATTTTTCGG GTAATCCCTG AGGCGAAATA CTATTGCCAC TGACATCAAA GGAACTACAC AGTCTCTCTC ACACATCACA TCCCTGACAC 
-2781 -2761 -2741 -2721 -2701

TAGAAAGAAA GAAGCCTAAA ATGGCTAAAA GGTTAGGTGC AATGTTTCAT TAGAGAGGCT TGGAACTGTT AAGGGAAAGG TCACGAGTCG TCTACTCATA 
ATCTTTCTTT CTTCGGATTT TACCGATTTT CCAATCCACG TTACAAAGTA ATCTCTCCGA ACCTTGACAA TTCCCTTTCC AGTGCTCAGC AGATGAGTAT 

-2681 -2661 -2641 -2621 -2601 

AAAACTCTGA CACTTTGACC AATCAAAACT CAAAGACCTC ACCAGTTGTG TCACGTGCGC CTCTAAACAC TATTCAATTT CAAATATAAA TGATTCATGC 
TTTTGAGACT GTGAAACTGG TTAGTTTTGA GTTTCTGGAG TGGTCAACAC AGTGCACGCG GAGATTTGTG ATAAGTTAAA GTTTATATTT ACTAAGTACG 

-2581 -2561 -2541 -2521 -2501

GGTTCCAAAC GCCAATTGAT GGATGTTCTA CCAAATTTAA TCTACTTTTA CCAAACCATG ACAAATATGA ATAAACATTA CTTGATAATA ATTTTGTGAG 
CCAAGGTTTG CGGTTAACTA CCTACAAGAT GGTTTAAATT AGATGAAAAT GGTTTGGTAC TGTTTATACT TATTTGTAAT GAACTATTAT TAAAACACTC 

-2481 -2461 -2441 -2421 -2401 

TGAACAAACT TTTTTTTTTT CGAAACCAAA CCAAGCTGAA AAAAACTCAA CGATTTTCTT TGTTTAAAAT ACGTTAGAAA GGAATATGTA TTATGCCGAA
ACTTGTTTGA AAAAAAAAAA GCTTTGGTTT GGTTCGACTT TTTTTGAGTT GCTAAAAGAA ACAAATTTTA TGCAATCTTT CCTTATACAT AATACGGCTT 

-2381 -2361 -2341 -2321 -2301 

ATAAGTAATA TCGATCAGGC CACCTCTCTT ATAGTTATTC TCCTAGCAAC TTTAACCACT AGAAGGTTTT GTTTTCTAGT GTTTTCTAAT ATACGTCATC 
TATTCATTAT AGCTAGTCCG GTGGAGAGAA TATCAATAAG AGGATCGTTG AAATTGGTGA TCTTCCAAAA CAAAAGATCA CAAAAGATTA TATGCAGTAG 

-2281 -2261 -2241 -2221 -2201 

AAAATTTTCA AAAAATACTA CATTTTTGTT TTAAAAACTT CCATAATTCC ATTACTCGTA GAACACAAAC GCAAACCATA TTAATATTTT GTTGTCAACA
TTTTAAAAGT TTTTTATGAT GTAAAAACAA AATTTTTGAA GGTATTAAGG TAATGAGCAT CTTGTGTTTG CGTTTGGTAT AATTATAAAA CAACAGTTGT 

-2181 -2161 -2141 -2121 -2101 

AAAATTTCAA ATTATAATTC AACTATATTT GCTTGATTAC CCAATTAGAT AGAAAAGAGT TAAAGAAGAA AAGAAAAGAG TTTACAGTAA ATTAACGCAA 
TTTTAAAGTT TAATATTAAG TTGATATAAA CGAACTAATG GGTTAATCTA TCTTTTCTCA ATTTCTTCTT TTCTTTTCTC AAATGTCATT TAATTGCGTT 

-2081 -2061 -2041 -2021 -2001 

ACCATAATTA TATTTAACAC CGTATTAATC ACATCAACCA TATGACTTTT TTACCGTTTG CAACTTCATA ATTCATATAG TATCATAATA AATTCGCAAT
TGGTATTAAT ATAAATTGTG GCATAATTAG TGTAGTTGGT ATACTGAAAA AATGGCAAAC GTTGAAGTAT TAAGTATATC ATAGTATTAT TTAAGCGTTA 

-1981 -1961 -1941 -1921 -1901 

AATACAACAC AAGAGTTTCG TCGGAAGAGT AAATAATACT CAAATAGGGG GTGAGTGATA CGAGCCACAT GTATTCTTGA AGGGTAGATT ATTGCAAACT 
TTATGTTGTG TTCTCAAAGC AGCCTTCTCA TTTATTATGA GTTTATCCCC CACTCACTAT GCTCGGTGTA CATAAGAACT TCCCATCTAA TAACGTTTGA 

-1881 -1861 -1841 -1821 -1801 

TGGAGTAATA AAGAGAAGAA GAATGGGTTT GTAGTAGTTG CGTGGAGTAT CTTTATTTGG GTAAAACTTT AATTTAGAAA TAAAATTCTG TACGGACAAT 
ACCTCATTAT TTCTCTTCTT CTTACCCAAA CATCATCAAC GCACCTCATA GAAATAAACC CATTTTGAAA TTAAATCTTT ATTTTAAGAC ATGCCTGTTA 

-1781 -1761 -1741 -1721 -1701 

GGATCGTGTC CCAATCAGAT TTCTTGTGGC TGCTTCGGGT CTGGTTTTGG GTCCCTTTGA AAAATTTTAG TGGTCGACAC TTTTTATTTT ACTCTGGCTC
CCTAGCACAG GGTTAGTCTA AAGAACACCG ACGAAGCCCA GACCAAAACC CAGGGAAACT TTTTAAAATC ACCAGCTGTG AAAAATAAAA TGAGACCGAG 

-1681 -1661 -1641 -1621 -1601 

GTGCCTCGAG GGTCCCTCTA TTCACTGTTT CTTCGTATGA AGGTATGCTT AAACATTATT TTATTTTTAA AAACCCTTTA ATTTTATTTT CTTACCTTTA
CACGGAGCTC CCAGGGAGAT AAGTGACAAA GAAGCATACT TCCATACGAA TTTGTAATAA AATAAAAATT TTTGGGAAAT TAAAATAAAA GAATGGAAAT

-1581 -1561 -1541 -1521 -1501

ATCACGGTTT TGTAAATTGC TTTTTAGTCT ATGGAATGAT GATTGTGGCG ATTGAAATCA TATGTTTGGT TCTGTTGTTG ACGTTGGTGA AGTATATGTG 
TAGTGCCAAA ACATTTAACG AAAAATCAGA TACCTTACTA CTAACACCGC TAACTTTAGT ATACAAACCA AGACAACAAC TGCAACCACT TCATATACAC 

-1481 -1461 -1441 -1421 -1401

ATTTGTAATG TTGAGCTTAT GTATTAAAAT GTTAAATGAT AAATAACCTC GTAAGAAAGT GATTTCATTT AAATTTTATT TTGAGTTACA TATTCAATTG 
TAAACATTAC AACTCGAATA CATAATTTTA CAATTTACTA TTTATTGGAG CATTCTTTCA CTAAAGTAAA TTTAAAATAA AACTCAATGT ATAAGTTAAC 

-1381 -1361 -1341 -1321 -1301 

GTTTTATAAA AAAATACTTC AGTGATGATT GATACCCCCA TTGTGTGTGT AATTGTTACT GGGATTGAAC AAAATTTATT TGTGCATGAC AAACTTTCCA 
CAAAATATTT TTTTATGAAG TCACTACTAA CTATGGGGGT AACACACACA TTAACAATGA CCCTAACTTG TTTTAAATAA ACACGTACTG TTTGAAAGGT 

-1281 -1261 -1241 -1221 -1201 

AATTAGTGCA TAGATTGTAA TTGTATAATG GACTACATGT ATCTGAGTAG ATATGGTTCA TTAGGTTACA AACCTCTTTT TTTAAGGACA CAATTTTTCG 
TTAATCACGT ATCTAACATT AACATATTAC CTGATGTACA TAGACTCATC TATACCAAGT AATCCAATGT TTGGAGAAAA AAATTCCTGT GTTAAAAAGC 

-1181 -1161 -1141 -1121 -1101

ACAAGTTATA TGCCACATGA TTGACTACTA AATTTTCAAA AATTATTGCA CTAATGTCTT TGAAATTAAC AAATTATTTT GTCATTTCCG AGTTGGATTC 
TGTTCAATAT ACGGTGTACT AACTGATGAT TTAAAAGTTT TTAATAACGT GATTACAGAA ACTTTAATTG TTTAATAAAA CAGTAAAGGC TCAACCTAAG 

-1081 -1061 -1041 -1021 -1001

TTACAAACCA AGGCCGAACT CACAAACTTA TTTCTTTCAG TAAAAACAAA ACATTGTCCT CAGAAAAATT CTGAAATGTC ATCTTCCCAA ATGTTTTTAC 
AATGTTTGGT TCCGGCTTGA GTGTTTGAAT AAAGAAAGTC ATTTTTGTTT TGTAACAGGA GTCTTTTTAA GACTTTACAG TAGAAGGGTT TACAAAAATG 

-981 -961 -941 -921 -901 

ATAAATAAAA ATAATATACA GTTGATATTA TTTTGTTCTT TCTGAATTTT GTTATGAGGT ACCATTACCA TATAGTACGT AGATTTACAA AAATGAAAAT 
TATTTATTTT TATTATATGT CAACTATAAT AAAACAAGAA AGACTTAAAA CAATACTCCA TGGTAATGGT ATATCATGCA TCTAAATGTT TTTACTTTTA 

-881 -861 -841 -821 -801 

ACGTTGTAGC CCTTGATGTT CTTCAGGTCT TCTAGTTAGT TTTTGCAGTA AATACCAACC AATTAGTTAC AAGGAGTATA AGTGAACAAA GTGAGACAAC 
TGCAACATCG GGAACTACAA GAAGTCCAGA AGATCAATCA AAAACGTCAT TTATGGTTGG TTAATCAATG TTCCTCATAT TCACTTGTTT CACTCTGTTG 

-781 -761 -741 -721 -701 

TCATTTTATG CTTCCCTATA AAAAGAAATT CCCCACTGAC CCAAACACAC ACTTCTCTTC TCTCTCTCAT CTCATTGGAG ACTTATAAAT CCTATTACCT 
AGTAAAATAC GAAGGGATAT TTTTCTTTAA GGGGTGACTG GGTTTGTGTG TGAAGAGAAG AGAGAGAGTA GAGTAACCTC TGAATATTTA GGATAATGGA 

-681 -661 -641 -621 -601

CACCATATCC AATAACCACC ACACACAGAC CAATATCCAA AAAAAAAACT AAAACTAAAA ATATAATATA TATCGTTTTC TTTCCAAAAA TAATCATTTA 
GTGGTATAGG TTATTGGTGG TGTGTGTCTG GTTATAGGTT TTTTTTTTGA TTTTGATTTT TATATTATAT ATAGCAAAAG AAAGGTTTTT ATTAGTAAAT 

-581 -561 -541 -521 -501

AGAAACCCCA TCATCTTGAT AGTATTATAA AATTAATAAA CCTCTCCCTG AAAATATCTC ATCCTTCACC AATCAAAACC TTCTCATGTC TTCTTCTCTC 
TCTTTGGGGT AGTAGAACTA TCATAATATT TTAATTATTT GGAGAGGGAC TTTTATAGAG TAGGAAGTGG TTAGTTTTGG AAGAGTACAG AAGAAGAGAG 

-481 -461 -441 -421 -401 

CTCGACCTTT GAGGTGGAAA ATTAAATATA TTCCCTTAGC TTTTTTTCTC CTTTAGTTTT CTTCTTCTTC TTGAGTTTTT TTTCTTTTGA TCCTCTCTAA
GAGCTGGAAA CTCCACCTTT TAATTTATAT AAGGGAATCG AAAAAAAGAG GAAATCAAAA GAAGAAGAAG AACTCAAAAA AAAGAAAACT AGGAGAGATT 

-381 -361 -341 -321 -301

TTTCCTTGTT GATTCATCGA CTAGATCTAA TTCTTCTCAC AAAAGACTGA GTGTGTTCTT TCTTTCAAAT CTTTCAAAAA CTAGGGTTTT TACTGTCTTG 
AAAGGAACAA CTAAGTAGCT GATCTAGATT AAGAAGAGTG TTTTCTGACT CACACAAGAA AGAAAGTTTA GAAAGTTTTT GATCCCAAAA ATGACAGAAC 

-281 -261 -241 -221 -201

AAATCATATT TATTCTTCTA AATTTAGCAA AAAGAACACG ATTTACTTTC CATTTCAGTC GTCTTGTCAC TCTCTCTCTC TTCTTTAAAG TCTCCCTTTT
TTTAGTATAA ATAAGAAGAT TTAAATCGTT TTTCTTGTGC TAAATGAAAG GTAAAGTCAG CAGAACAGTG AGAGAGAGAG AAGAAATTTC AGAGGGAAAA 

-181 -161 -141 -121 -101

TAGCAAAAAT TCTCTCTCTC ACAAAATTTA TTTCCTCTGG CTTCTTCTTC CTCCTCCTCC ATCTCTTCTC TTTACTCTCT CTTTAATCAT CTCTCATTCT 
ATCGTTTTTA AGAGAGAGAG TGTTTTAAAT AAAGGAGACC GAAGAAGAAG GAGGAGGAGG TAGAGAAGAG AAATGAGAGA GAAATTAGTA GAGAGTAAGA 

-81 -61 -41 -21 -1 

TGAATCTTGA TCCATCAAAA TCAATCCCGT TCTCGAAAGA TCCATTAAAA TCAAAACCTA AGCTCTCTCT CTTGCTTCTA GGGTTTTTTT GTTCGTTGTG 
ACTTAGAACT AGGTAGTTTT AGTTAGGGCA AGAGCTTTCT AGGTAATTTT AGTTTTGGAT TCGAGAGAGA GAACGAAGAT CCCAAAAAAA CAAGCAACAC 

20 40 60 80 100 

ATGGCGAGAG AAAAGATTCA GATCAGGAAG ATCGACAACG CAACGGCGAG ACAAGTGACG TTTTCGAAAC GAAGAAGAGG GCTTTTCAAG AAAGCTGAAG 
TACCGCTCTC TTTTCTAAGT CTAGTCCTTC TAGCTGTTGC GTTGCCGCTC TGTTCACTGC AAAAGCTTTG CTTCTTCTCC CGAAAAGTTC TTTCGACTTC 

120 140 160 180 200 

AACTCTCCGT TCTCTGCGAC GCCGATGTCG CTCTCATCAT CTTCTCTTCC ACCGGAAAAC TGTTCGAGTT CTGTAGCTCC AGGTCTTTCT TTCTCTCTCT 
TTGAGAGGCA AGAGACGCTG CGGCTACAGC GAGAGTAGTA GAAGAGAAGG TGGCCTTTTG ACAAGCTCAA GACATCGAGG TCCAGAAAGA AAGAGAGAGA 

220 240 260 280 300 

AACTTCCCTC TCTATAGATT TCTCATAACT CATCGAAGGA ATCTTGTCTA GATCCAGACA AAAAACTTTA AAGAGTTTTT AGATGTATAT CTGATACATA 
TTGAAGGGAG AGATATCTAA AGAGTATTGA GTAGCTTCCT TAGAACAGAT CTAGGTCTGT TTTTTGAAAT TTCTCAAAAA TCTACATATA GACTATGTAT 

320 340 360 380 400 

GGAGTTTACT GTATCAATCT TTATAGGACC ACTAACTATT TATATAATTA AAATAGTTGT TAGAAACATT AATCATGACC ATAAATGACA TATATAAAGT 
CCTCAAATGA CATAGTTAGA AATATCCTGG TGATTGATAA ATATATTAAT TTTATCAACA ATCTTTGTAA TTACTACTGG TATTTACTGT ATATATTTCA

420 440 460 480 500 

GTATAGTAAA ACTCTGTATT TAGATAAATT AAGGTATCTA ACTACGGTAA TATTCAAAAA GATGTAAATC TGGATATGCA TATATGTATA TTATTAGTAT 
CATATCATTT TGAGACATAA ATCTATTTAA TTCCATAGAT TGATGCCATT ATAAGTTTTT CTACATTTAG ACCTATACGT ATATACATAT AATAATCATA 

520 540 560 580 600

ATAAATACAT GCTCTATAGT AGGTATTTGT GTCAACCATG TATAAATCTA TGTATATAGA TATTGTGGTA TGATATGTTT AAGCCGTCAA TGTCATATTT 
TATTTATGTA CGAGATATCA TCCATAAACA CAGTTGGTAC ATATTTAGAT ACATATATCT ATAACACCAT ACTATACAAA TTCGGCAGTT ACAGTATAAA 

620 640 660 680 700

ATATAGAAAT ATGTGGGTAC CATAACATGA GGAAGTATCT ATATGTGTGG ATGTATAAAG CTTTCCCTTT GAAGAAGTAA TCTAAAAATA ATATATATAT 
TATATCTTTA TACACCCATG GTATTGTACT CCTTCATAGA TATACACACC TACATATTTC GAAAGGGAAA CTTCTTCATT AGATTTTTAT TATATATATA 

720 740 760 780 800 

ATATATGTAT ATGTATAGAT ATGTTGGAAT CTTTATTAGT GTTGGGAAAA GTCATTTAGA GAGATATTAT TGATATTAGG GATCTAAAAT GACTTATCGT 
TATATACATA TACATATCTA TACAACCTTA GAAATAATCA CAACCCTTTT CAGTAAATCT CTCTATAATA ACTATAATCC CTAGATTTTA CTGAATAGCA 

820 840 860 880 900 

ATTACAGAGA TACGATTTTG GATTTTTGAC CCACTAGTTA TCAGCTCAGT TCCTATCTTC GGGGACATAC ACACTTTCAC AGATAATTGT GTATATATGT
TAATGTCTCT ATGCTAAAAC CTAAAAACTG GGTGATCAAT AGTCGAGTCA AGGATAGAAG CCCCTGTATG TGTGAAAGTG TCTATTAACA CATATATACA 

920 940 960 980 1000 

AACTGAAAAC GATAGTGTTA ACATGAAATA ATGTACATGT TTGGGATTAA ATGTGTTTTG TGGATTTGGT TTGCATCTTT TGATTTTAGA TTTTGGTATA 
TTGACTTTTG CTATCACAAT TGTACTTTAT TACATGTACA AACCCTAATT TACACAAAAC ACCTAAACCA AACGTAGAAA ACTAAAATCT AAAACCATAT 

1020 1040 1060 1080 1100 

TTGTCGGTGT TTACATATGC ACATTGTTAA TATCAACAGT ATAGTTGTTT ATAATAAGTT ATTTATTGGA ATGTGTTTAT ATTATGAAGC ATGAAGGAAG 
AACAGCCACA AATGTATACG TGTAACAATT ATAGTTGTCA TATCAACAAA TATTATTCAA TAAATAACCT TACACAAATA TAATACTTCG TACTTCCTTC 

1120 1140 1160 1180 1200 

TCCTAGAGAG GCATAACTTG CAGTCAAAGA ACTTGGAGAA GCTTGATCAG CCATCTCTTG AGTTACAGGT TAGCTACATT CTCGAAACGA CCACACATTT
AGGATCTCTC CGTATTGAAC GTCAGTTTCT TGAACCTCTT CGAACTAGTC GGTAGAGAAC TCAATGTCCA ATCGATGTAA GAGCTTTGCT GGTGTGTAAA 

1220 1240 1260 1280 1300 

TCTTTCCCGA TTTCTGTAAC TTGCAAAATC GAGTATTACT CCGTTGAATT ACCAATATGT TTTAGATTGT TGTATTTATT GACCAAGAAT CTCTTAAAAC
AGAAAGGGCT AAAGACATTG AACGTTTTAG CTCATAATGA GGCAACTTAA TGGTTATACA AAATCTAACA ACATAAATAA CTGGTTCTTA GAGAATTTTG 

1320 1340 1360 1380 1400 

TTTGTATTAA TAGGTACAAA ACTTTATATT ATTGCATATG ATTAATTAGA CTCGATCCAT GTAGTAGTCA TGTAGAGTAG TCCTGTGTAG AGAGTTGAGC 
AAACATAATT ATCCATGTTT TGAAATATAA TAACGTATAC TAATTAATCT GAGCTAGGTA CATCATCAGT ACATCTCATC AGGACACATC TCTCAACTCG 

1420 1440 1460 1480 1500 

TTTAGATCAT TATGGATATG ATTAAGAGCT TAAATCAATG TTTTATTCTG TTAGCTGGTT GAGAACAGTG ATCACGCCCG AATGAGTAAA GAAATTGCGG 
AAATCTAGTA ATACCTATAC TAATTCTCGA ATTTAGTTAC AAAATAAGAC AATCGACCAA CTCTTGTCAC TAGTGCGGGC TTACTCATTT CTTTAACGCC 

1520 1540 1560 1580 1600 

ACAAGAGCCA CCGACTAAGG TACGTTATAT ATGTATATTC TATGACTTTT GAACTAACTA TCATTTTCTA ACTAATTTTT TTTTTGATCA ACCACTATCA 
TGTTCTCGGT GGCTGATTCC ATGCAATATA TACATATAAG ATACTGAAAA CTTGATTGAT AGTAAAAGAT TGATTAAAAA AAAAACTAGT TGGTGATAGT 

1620 1640 1660 1680 1700 

TTTTCTAACT GTGTGTTTAC ATGATCATAT ATAGGCAAAT GAGAGGAGAG GAACTTCAAG GACTTGACAT TGAAGAGCTT CAGCAGCTAG AGAAGGCCCT 
AAAAGATTGA CACACAAATG TACTAGTATA TATCCGTTTA CTCTCCTCTC CTTGAAGTTC CTGAACTGTA ACTTCTCGAA GTCGTCGATC TCTTCCGGGA 

1720 1740 1760 1780 1800

TGAAACTGGT TTGACGCGTG TGATTGAAAC AAAGGTTGTT AAGAAAATTA CTTGATACCA TGTATAAGTT TCTCTAAGCT TACGAGTATG CAATTTACTA 
ACTTTGACCA AACTGCGCAC ACTAACTTTG TTTCCAACAA TTCTTTTAAT GAACTATGGT ACATATTCAA AGAGATTCGA ATGCTCATAC GTTAAATGAT 

1820 1840 1860 1880 1900

ATACGAGATG TGTTTGCAGA GTGACAAGAT TATGAGTGAG ATCAGCGAAC TTCAGAAAAA GGTAATAATT AACCAAAATA ACGTTTATTC TTTACTTGAT 
TATGCTCTAC ACAAACGTCT CACTGTTCTA ATACTCACTC TAGTCGCTTG AAGTCTTTTT CCATTATTAA TTGGTTTTAT TGCAAATAAG AAATGAACTA 

1920 1940 1960 1980 2000 

GATTTCAATA TTAATTTTGG CAGTTTCAAG ATCCAAAATT TTCATCTTCT TCTCTTTTTT TTTGGTGTTC AGGGAATGCA ATTGATGGAT GAGAACAAGC 
CTAAAGTTAT AATTAAAACC GTCAAAGTTC TAGGTTTTAA AAGTAGAAGA AGAGAAAAAA AAACCACAAG TCCCTTACGT TAACTACCTA CTCTTGTTCG 

2020 2040 2060 2080 2100

GGTTGAGGCA GCAAGTATGT GTCTTACCCT CTCTGTTGAT AACAAATCCC TTTCTTTTGT CTACCATTAA CGTACACACC CCTAAATTTA ATCCCCAGTT 
CCAACTCCGT CGTTCATACA CAGAATGGGA GAGACAACTA TTGTTTAGGG AAAGAAAACA GATGGTAATT GCATGTGTGG GGATTTAAAT TAGGGGTCAA 

2120 

GTCTACAACA CATATGTTTG ATCATACTGT GAGA 
CAGATGTTGT GTATACAAAC TAGTATGACA CTCT 

SEQ ID NO: 46

Arabidopsis AGL24 genomic sequence 

-2981 -2961 -2941 -2921 -2901

AGACTTACAA TAACTTCATC AAGCAACTCA TACACGAGCA CAAAGTTTTT CCTGAATGAA TCTTCATTCA GAACACCAAG ATAATCCTTA ATAACACGAG 
TCTGAATGTT ATTGAAGTAG TTCGTTGAGT ATGTGCTCGT GTTTCAAAAA GGACTTACTT AGAAGTAAGT CTTGTGGTTC TATTAGGAAT TATTGTGCTC 

-2881 -2861 -2841 -2821 -2801 

CAATCCTTTG TAGAAGCTCC AAAACAAGAG AGGGTGACAC GTTAACTCTC GTTGTCGCAA CAAAATATAG ACCAACAACC TTGACATGGA AGTAGTTCAC 
GTTAGGAAAC ATCTTCGAGG TTTTGTTCTC TCCCACTGTG CAATTGAGAG CAACAGCGTT GTTTTATATC TGGTTGTTGG AACTGTACCT TCATCAAGTG 

-2781 -2761 -2741 -2721 -2701

GCCATCGACA TTCTATAAGC ACAAAAAATA AGTTAGATGA AATCATTACA GCTCACAACC AAACAGAAAG TATAATACCT ACAAAGATAG GTGGCGCCTC 
CGGTAGCTGT AAGATATTCG TGTTTTTTAT TCAATCTACT TTAGTAATGT CGAGTGTTGG TTTGTCTTTC ATATTATGGA TGTTTCTATC CACCGCGGAG

-2681 -2661 -2641 -2621 -2601 

TGCATTGCCA TCCTCCTTCC AGAACTTGAC TTTACGGAAG AATGTCTCTG TACTTCCTTT GGGTACCTCA GCCCGGTCTG TAGCAATAAA ACGTTACACA 
ACGTAACGGT AGGAGGAAGG TCTTGAACTG AAATGCCTTC TTACAGAGAC ATGAAGGAAA CCCATGGAGT CGGGCCAGAC ATCGTTATTT TGCAATGTGT 

-2581 -2561 -2541 -2521 -2501

TCTTGAAACT TGTATTGGAT CCAACCAAAT CGTATAATCT CAAAACAAAT AGCTTTCTTC TACTACATTA CATACAGATA CTCTGCCCAA ACTAATTGAA 
AGAACTTTGA ACATAACCTA GGTTGGTTTA GCATATTAGA GTTTTGTTTA TCGAAAGAAG ATGATGTAAT GTATGTCTAT GAGACGGGTT TGATTAACTT

-2481 -2461 -2441 -2421 -2401 

TAGTTTTGCT ATATTTGTAC AATCTGATTT GGAAATTCAG CTCAACATAA TTTGTCATCG GATAAGAAAT GTTGGTAGAT CAAACAGATC AATGAGCTTA 
ATCAAAACGA TATAAACATG TTAGACTAAA CCTTTAAGTC GAGTTGTATT AAACAGTAGC CTATTCTTTA CAACCATCTA GTTTGTCTAG TTACTCGAAT 

-2381 -2361 -2341 -2321 -2301 

GAGAAGATTT CAATGGAAAA TTCTCATGAA ACAGTGACAT AAGACTCGAC TCTGAAGAGA AAAAGCAAAA CAGGAAGAAG CAGAGAGGAT CAGATCGAGA 
CTCTTCTAAA GTTACCTTTT AAGAGTACTT TGTCACTGTA TTCTGAGCTG AGACTTCTCT TTTTCGTTTT GTCCTTCTTC GTCTCTCCTA GTCTAGCTCT

-2281 -2261 -2241 -2221 -2201 

AAGAGAGCTT ACAGTCACGG AAGACGATGT TATCTCCTCG CTGAGATAAG ACGAAGAATT GGGAGATCAT CATCGTTCCT TATAGCGGTG GATTCCGACT
TTCTCTCGAA TGTCAGTGCC TTCTGCTACA ATAGAGGAGC GACTCTATTC TGCTTCTTAA CCCTCTAGTA GTAGCAAGGA ATATCGCCAC CTAAGGCTGA 

-2181 -2161 -2141 -2121 -2101 

GTTTCACCGC GAGTTTGGTT AAGTCTACTG ATCGCCGATC GGTCTCGTCT TTTTGTGTGT CTGGTGGTGA GGTGGTTCAC GTTTTACCAT TTGCCGTCGT
CAAAGTGGCG CTCAAACCAA TTCAGATGAC TAGCGGCTAG CCAGAGCAGA AAAACACACA GACCACCACT CCACCAAGTG CAAAATGGTA AACGGCAGCA

-2081 -2061 -2041 -2021 -2001 

TATCGTGAAG CTTCTTCATG AGACGGAGGG TTCTGTGTTT TTGTGAATTA TGATTTCTTG TTCTTATATG GGCCTATTTT TAAGACATCA ATATGGCCCA 
ATAGCACTTC GAAGAAGTAC TCTGCCTCCC AAGACACAAA AACACTTAAT ACTAAAGAAC AAGAATATAC CCGGATAAAA ATTCTGTAGT TATACCGGGT 

-1981 -1961 -1941 -1921 -1901

AATTTCGAAC TTGTTATGAG TTTAAGGAAA TAAGTAGTAA GTACTATAAA TGATGGTTCG ATCTCGGAGG AGAAAAAAAA AAACATTGTT TACGAGGAAG 
TTAAAGCTTG AACAATACTC AAATTCCTTT ATTCATCATT CATGATATTT ACTACCAAGC TAGAGCCTCC TCTTTTTTTT TTTGTAACAA ATGCTCCTTC 

-1881 -1861 -1841 -1821 -1801

CAAAATGTGA GTTGATATAA AGGGTACAAC ACATAATTTA TTTTTGGAAG TCAAAACTTT GAGGATTAAG CTGACAACGA AGGTTAGTGA AGACTTTCGG 
GTTTTACACT CAACTATATT TCCCATGTTG TGTATTAAAT AAAAACCTTC AGTTTTGAAA CTCCTAATTC GACTGTTGCT TCCAATCACT TCTGAAAGCC 

-1781 -1761 -1741 -1721 -1701 

GATCGAGCAA TCGGGAGATA TACATGAGCC TAGAGGGCTG ACAAGATGAC CAAGCATTCC AAATGAAAGG CTTAAGATTT TTCTTTTTCT AAACTCAAGT 
CTAGCTCGTT AGCCCTCTAT ATGTACTCGG ATCTCCCGAC TGTTCTACTG GTTCGTAAGG TTTACTTTCC GAATTCTAAA AAGAAAAAGA TTTGAGTTCA 

-1681 -1661 -1641 -1621 -1601 

AAGAAACACA AGATATATGA AAGGGTAACA AGGGTCAACA ACAAGTCTAA GCTTTTTAAA CGTGTTAGAT GATTCTTCTT GAACACTATT ACAATTACTG 
TTCTTTGTGT TCTATATACT TTCCCATTGT TCCCAGTTGT TGTTCAGATT CGAAAAATTT GCACAATCTA CTAAGAAGAA CTTGTGATAA TGTTAATGAC

-1581 -1561 -1541 -1521 -1501

TTTAGTTTCA CATTTATATG ACCTTGGGAG TCTTCTAGCT CGTCCCAAAT ATATTTTCAA CATATTACTA TAAGATCCTA AAGACCAATA ACATTGATCT 
AAATCAAAGT GTAAATATAC TGGAACCCTC AGAAGATCGA GCAGGGTTTA TATAAAAGTT GTATAATGAT ATTCTAGGAT TTCTGGTTAT TGTAACTAGA 

-1481 -1461 -1441 -1421 -1401 

ACACCAAAAA CTCTCACTTT CTGATTTTGC ACTCGCTTTT TTTCCTCCCA TAAACAAAAC CAAAGGCTTA CAATACTAAA TCTGTCTCAC ATTCTTAGTG
TGTGGTTTTT GAGAGTGAAA GACTAAAACG TGAGCGAAAA AAAGGAGGGT ATTTGTTTTG GTTTCCGAAT GTTATGATTT AGACAGAGTG TAAGAATCAC 

-1381 -1361 -1341 -1321 -1301 

CTTATTTGTT TTAGTCATAA AGAACTTAAT CTTATACAGA TTGAAGTCTT AAAGTCATCT ATATTACTTT TCACATGTAT CATTATGAGA TGGTACGTTT 
GAATAAACAA AATCAGTATT TCTTGAATTA GAATATGTCT AACTTCAGAA TTTCAGTAGA TATAATGAAA AGTGTACATA GTAATACTCT ACCATGCAAA

-1281 -1261 -1241 -1221 -1201

CCCACGAATT TTATCAGTTT AGTTTAATTT TCAGTTGTAC TTTGGGAGAA AAAATTTACA AGATACTTGT CGGCCATGAT ATCACCCTAG AGTTACCGGA 
GGGTGCTTAA AATAGTCAAA TCAAATTAAA AGTCAACATG AAACCCTCTT TTTTAAATGT TCTATGAACA GCCGGTACTA TAGTGGGATC TCAATGGCCT 

-1181 -1161 -1141 -1121 -1101 

GTCCGGTGAT ATATCATTTC TAATTAGGGT TAAAACTTAA AAGGGTATAA ATGGCTGATC AAACCCAAAA ATAAAAGATA ATGATGACGG TGGGAGACGA 
CAGGCCACTA TATAGTAAAG ATTAATCCCA ATTTTGAATT TTCCCATATT TACCGACTAG TTTGGGTTTT TATTTTCTAT TACTACTGCC ACCCTCTGCT

-1081 -1061 -1041 -1021 -1001 

GTGATCTTAT CAGGTGTCGC ATCTAGCATA TATAGGTGAA AGACTATAAA AAAGACATGA AATATTTAAT AGACACAACT TTTGTAATAA ACCAAAACCA 
CACTAGAATA GTCCACAGCG TAGATCGTAT ATATCCACTT TCTGATATTT TTTCTGTACT TTATAAATTA TCTGTGTTGA AAACATTATT TGGTTTTGGT 

-981 -961 -941 -921 -901 

AAAAGGTAGA TGAACTGATG AACAGCATCT TCTAATTACG AATAAAAAAA GTAACCAAAC TTTCTTTCCA TTAGAATTGG TACGTAGTTC CTTGTGTATT 
TTTTCCATCT ACTTGACTAC TTGTCGTAGA AGATTAATGC TTATTTTTTT CATTGGTTTG AAAGAAAGGT AATCTTAACC ATGCATCAAG GAACACATAA 

-881 -861 -841 -821 -801 

GTGATTTCTT TCATTTTCCA ATTATGTTTT TTTATTTTAT CATGTTACAT TTTTGATAGT GGGTAACTTT TGTATCATTT TATTTGACCT AGCCATATAT 
CACTAAAGAA AGTAAAAGGT TAATACAAAA AAATAAAATA GTACAATGTA AAAACTATCA CCCATTGAAA ACATAGTAAA ATAAACTGGA TCGGTATATA 

-781 -761 -741 -721 -701 

AAATCTATTA ACTTATACGG AGTAGTATTT CACGTCATTT ATTTTTATTT TGTTTTTAGA TGGGAAGTTA TTCAAAACTA GACTAAAACA GTAAAACTAG 
TTTAGATAAT TGAATATGCC TCATCATAAA GTGCAGTAAA TAAAAATAAA ACAAAAATCT ACCCTTCAAT AAGTTTTGAT CTGATTTTGT CATTTTGATC 

-681 -661 -641 -621 -601 

GAAACCCGCT ACTGAATAAA GTTACAATTC CACATTATTC CATGACAGAC TAATTGAATT AGAAGGTTAG GTAAATTATT AAATCATAAC TGTAGCAGTC 
CTTTGGGCGA TGACTTATTT CAATGTTAAG GTGTAATAAG GTACTGTCTG ATTAACTTAA TCTTCCAATC CATTTAATAA TTTAGTATTG ACATCGTCAG 

-581 -561 -541 -521 -501 

TCTTCGTCTG GCAGCTCAGT CAGACAAAAC ACAAAGTGTG TTTATGTGTT ATTTTTAATG ATTATAGTTT GGGAAAAAGA CATAATCAAA AGGGATACAA 
AGAAGCAGAC CGTCGAGTCA GTCTGTTTTG TGTTTCACAC AAATACACAA TAAAAATTAC TAATATCAAA CCCTTTTTCT GTATTAGTTT TCCCTATGTT 

-481 -461 -441 -421 -401 

AACATATGGC CCATTGATAA GTATAGATCA CTGTTTAGCT AAAAAAAGCA GACTCTTTTT TCCAATCTTG AACACAAACA CAGTCACCAT CTCTCTCTCT 
TTGTATACCG GGTAACTATT CATATCTAGT GACAAATCGA TTTTTTTCGT CTGAGAAAAA AGGTTAGAAC TTGTGTTTGT GTCAGTGGTA GAGAGAGAGA 

-381 -361 -341 -321 -301 

CTTTCTCTCT CACTCACACA TTAGGGAGTA AACAGCTACC AGAAAAACCT TTTTTATCTT CTCACAAATT TAATAAAGTG GGTGCTGAGA TTGAATAACG 
GAAAGAGAGA GTGAGTGTGT AATCCCTCAT TTGTCGATGG TCTTTTTGGA AAAAATAGAA GAGTGTTTAA ATTATTTCAC CCACGACTCT AACTTATTGC

-281 -261 -241 -221 -201 

TAATCCAAGA TCCTCCAACT CACAGAAAGG TAAAAGCTGT GAATCTGTGT TCTTTCTTCT TAAGCAAAGT GTTTGATGAA TTCATCTAGT CCTGTCCATT 
ATTAGGTTCT AGGAGGTTGA GTGTCTTTCC ATTTTCGACA CTTAGACACA AGAAAGAAGA ATTCGTTTCA CAAACTACTT AAGTAGATCA GGACAGGTAA 

-181 -161 -141 -121 -101 

CTTTTGCTTC TCATGGTTTA TGGATCTGAT CTCTCTTTCT CTCTCTCTCT AGCCATTAGG GTTTCCTAAG AATATTATAT AAACTCTCTT TAGCTAACAC 
GAAAACGAAG AGTACCAAAT ACCTAGACTA GAGAGAAAGA GAGAGAGAGA TCGGTAATCC CAAAGGATTC TTATAATATA TTTGAGAGAA ATCGATTGTG 

-81 -61 -41 -21 -1 

CGTTCCAATT GGTTTCTTTC TTTGTTCTTG GTCTAAAATC TAAATGGTGT TATGGGTATA GGCAGATTCA AGAACAGTAG TGAAGGAGAG ATCTGGTAAA 
GCAAGGTTAA CCAAAGAAAG AAACAAGAAC CAGATTTTAG ATTTACCACA ATACCCATAT CCGTCTAAGT TCTTGTCATC ACTTCCTCTC TAGACCATTT 

20 40 60 80 100 

ATGGCGAGAG AGAAGATAAG GATAAAGAAG ATTGATAACA TAACAGCGAG ACAAGTTACT TTCTCAAAGA GAAGAAGAGG AATCTTCAAG AAAGCCGATG 
TACCGCTCTC TCTTCTATTC CTATTTCTTC TAACTATTGT ATTGTCGCTC TGTTCAATGA AAGAGTTTCT CTTCTTCTCC TTAGAAGTTC TTTCGGCTAC 

120 140 160 180 200 

AACTTTCAGT TCTTTGCGAT GCTGATGTTG CTCTCATCAT CTTCTCTGCC ACCGGAAAGC TCTTCGAGTT CTCCAGCTCA AGGTATATTC TATCTTTTTG 
TTGAAAGTCA AGAAACGCTA CGACTACAAC GAGAGTAGTA GAAGAGACGG TGGCCTTTCG AGAAGCTCAA GAGGTCGAGT TCCATATAAG ATAGAAAAAC 

220 240 260 280 300 

TTAGTAGTTG TCTTATTTTT TTCAATCCAT GTTTGTGTTT TTGAGAATAT GGTTGGATAA ATATATTAAG ATATGTATTT AAATGAGATT TTTATTTTCT 
AATCATCAAC AGAATAAAAA AAGTTAGGTA CAAACACAAA AACTCTTATA CCAACCTATT TATATAATTC TATACATAAA TTTACTCTAA AAATAAAAGA 

320 340 360 380 400 

CGTTTACTCT CTAAAGTTAA TTATCAGTAG GCTCGGAGAT CTCATGTACG GCATAATTTG ATGACCTAAA TTATTATACT TTAAAGTATA GGATTGATGT
GCAAATGAGA GATTTCAATT AATAGTCATC CGAGCCTCTA GAGTACATGC CGTATTAAAC TACTGGATTT AATAATATGA AATTTCATAT CCTAACTACA



420 440 460 480 500 

TTTATTACTT TTATGTATAA CACATCATGT ATTTAATTCC GTTTAACATA ATATGGGTTT TTAACGTGTA ATTTTTCAAT CATTTTCATT TAGACTCATG 

AAATAATGAA AATACATATT GTGTAGTACA TAAATTAAGG CAAATTGTAT TATACCCAAA AATTGCACAT TAAAAAGTTA GTAAAAGTAA ATCTGAGTAC 

520 540 560 580 600 

GTTAAGATTT CTGTACTGGG AAATAAGAGA GCAGAATATT ATAGTGTGAT TTTTGTTAAT TAGGAAAGCA TATGTATATA TGGATACATA GTACTTACCA 

CAATTCTAAA GACATGACCC TTTATTCTCT CGTCTTATAA TATCACACTA AAAACAATTA ATCCTTTCGT ATACATATAT ACCTATGTAT CATGAATGGT 

620 640 660 680 700 

CAATTAGAAT GAATTTCTTT TCCCTTTTTT CATTTGACTT TGTGTATTAC AAAAGTCTTT GACACTGTCA CTTGGTATGA TTGGGGATTA ATTCTTAACC 

GTTAATCTTA CTTAAAGAAA AGGGAAAAAA GTAAACTGAA ACACATAATG TTTTCAGAAA CTGTGACAGT GAACCATACT AACCCCTAAT TAAGAATTGG 

720 740 760 780 800 

ACTCGTTTAG TTTATCTTGG GAAGCATTAC CATAATTGGG AAACGAGTCA TCTGTCTGTA TCGTGATGGC TACTTCTGAT TACTTTTCTT TTATTATAAC 

TGAGCAAATC AAATAGAACC CTTCGTAATG GTATTAACCC TTTGCTCAGT AGACAGACAT AGCACTACCG ATGAAGACTA ATGAAAAGAA AATAATATTG 

820 840 860 880 900 

CAAAAAGGCT TCTAATGTAC TTAATTAATT TTACAAATGT AATATGGACG AAGGAAATGT TTATAAGAAA GATGGATTGT TTGTTGAAAC GTGTAGAATG 

GTTTTTCCGA AGATTACATG AATTAATTAA AATGTTTACA TTATACCTGC TTCCTTTACA AATATTCTTT CTACCTAACA AACAACTTTG CACATCTTAC 

920 940 960 980 1000 

AGAGACATAT TGGGAAGGTA TAGTCTTCAT GCAAGTAACA TCAACAAATT GATGGATCCA CCTTCTACTC ATCTCCGGGT ATTTTCGATA TCACTTACTC 

TCTCTGTATA ACCCTTCCAT ATCAGAAGTA CGTTCATTGT AGTTGTTTAA CTACCTAGGT GGAAGATGAG TAGAGGCCCA TAAAAGCTAT AGTGAATGAG 

1020 1040 1060 1080 1100 

TTTTTTTTTT TTGTGGATTT TAAACTCTCT GCTCTTTTTA CCAAACCCTT CTCTTTTTAT CAAACCCTTC TCTCTATAAT ATTATCCGAT GTTCACTTTG

AAAAAAAAAA AACACCTAAA ATTTGAGAGA CGAGAAAAAT GGTTTGGGAA GAGAAAAATA GTTTGGGAAG AGAGATATTA TAATAGGCTA CAAGTGAAAC 

1120 1140 1160 1180 1200 

TTACACGTGT TTGTTATAAT TTTTAGCTGT AAGTCTAAAT ATAGAAACAT TGAGTGGCAT ATAATCATTA ATCTTGAAGC ATCTAATTAA TTGGTTTTAC 

AATGTGCACA AACAATATTA AAAATCGACA TTCAGATTTA TATCTTTGTA ACTCACCGTA TATTAGTAAT TAGAACTTCG TAGATTAATT AACCAAAATG 

1220 1240 1260 1280 1300 

ATATTAATAG CAGAATCCTG AAACTGTTGA CTTTGCATCT AGCAGCTTGA GAATTGTAAC CTCTCCAGAC TAAGTAAGGA AGTCGAAGAC AAAACCAAGC 

TATAATTATC GTCTTAGGAC TTTGACAACT GAAACGTAGA TCGTCGAACT CTTAACATTG GAGAGGTCTG ATTCATTCCT TCAGCTTCTG TTTTGGTTCG 

1320 1340 1360 1380 1400 

AGCTACGGTA TGGCTCCATT GATATGTTAT GCAGATAAAC CTATTTTCAT ATAGGCTATA GCTGTAAGAG ATCATCTATT TCATGTGTGT GGTTTTTTTT 

TCGATGCCAT ACCGAGGTAA CTATACAATA CGTCTATTTG GATAAAAGTA TATCCGATAT CGACATTCTC TAGTAGATAA AGTACACACA CCAAAAAAAA

1420 1440 1460 1480 1500 

TTTATGTTTT TTCAATGATG TGTGCATGCT ATTTTTAGGT TTTAGAATCT ATTTCATGGA AATTGAAGAT ATTTCATTTC ACGTGTAAGT TCGTCAAGTT 

AAATACAAAA AAGTTACTAC ACACGTACGA TAAAAATCCA AAATCTTAGA TAAAGTACCT TTAACTTCTA TAAAGTAAAG TGCACATTCA AGCAGTTCAA 

1520 1540 1560 1580 1600 

GTGGCGTGTG TCTTGGAAAT TGATGTTTTG TTTGTAGATT TTAAGAGCTA CTTCTAAAAT TTACAAGAGT TTTGTAATTT TCAATTATGG CCCATTATTC 

CACCGCACAC AGAACCTTTA ACTACAAAAC AAACATCTAA AATTCTCGAT GAAGATTTTA AATGTTCTCA AAACATTAAA AGTTAATACC GGGTAATAAG 

1620 1640 1660 1680 1700 

TCATTAATTC ATTAAAAAAA TTATATACAT TACTATCTAT ATCTAGCATA GGTAGTTTTT TTTTTCTTTT TCTTTGGTAG ACCTACTGAA CAAATATCTG 

AGTAATTAAG TAATTTTTTT AATATATGTA ATGATAGATA TAGATCGTAT CCATCAAAAA AAAAAGAAAA AGAAACCATC TGGATGACTT GTTTATAGAC 

1720 1740 1760 1780 1800 

ATATATCACT GACTGGATAA ATATCTATAG AGATATTTTT GATAGAAATG AGTGTTAATT TAACGTAAAA CAGGAAACTG AGAGGAGAGG ATCTTGATGG 

TATATAGTGA CTGACCTATT TATAGATATC TCTATAAAAA CTATCTTTAC TCACAATTAA ATTGCATTTT GTCCTTTGAC TCTCCTCTCC TAGAACTACC 

1820 1840 1860 1880 1900 

ATTGAACTTA GAAGAGTTGC AGCGGCTGGA GAAACTACTT GAATCCGGAC TTAGCCGTGT GTCTGAAAAG AAGGTTTACT ACTATACATA AACTAATAGC

TAACTTGAAT CTTCTCAACG TCGCCGACCT CTTTGATGAA CTTAGGCCTG AATCGGCACA CAGACTTTTC TTCCAAATGA TGATATGTAT TTGATTATCG 

1920 1940 1960 1980 2000

ATGCATATTT TCCTTAACGT GGCATATAAA TAATAAGCTG TACATATATA AAAGTTTGAC TTTGTTGTTG TTATTGGTAA ATAGGGCGAG TGTGTGATGA 

TACGTATAAA AGGAATTGCA CCGTATATTT ATTATTCGAC ATGTATATAT TTTCAAACTG AAACAACAAC AATAACCATT TATCCCGCTC ACACACTACT 

2020 2040 2060 2080 2100 

GCCAAATTTT CTCACTTGAG AAACGGGTTA GTAGTTAGTA CATACAATTC GTATAACTAA TGGATCATAA GCCTATCTAT AGCTAGTGAC TTTCTTAATA 
CGGTTTAAAA GAGTGAACTC TTTGCCCAAT CATCAATCAT GTATGTTAAG CATATTGATT ACCTAGTATT CGGATAGATA TCGATCACTG AAAGAATTAT

2120 2140 2160 2180 2200 

AGTGAAACAG GGATCGGAAT TGGTGGATGA GAATAAGAGA CTGAGGGATA AAGTACGGCT CTAAACCCTT ATAGATATCA TGGAATAACC TTAATCTATT 
TCACTTTGTC CCTAGCCTTA ACCACCTACT CTTATTCTCT GACTCCCTAT TTCATGCCGA GATTTGGGAA TATCTATAGT ACCTTATTGG AATTAGATAA 

2220 2240 2260 2280 2300

TTTTTATGTA TAAGAAAATA TGATGAGGGA ACGTATATTA TATATCGGCA GCTAGAGACG TTGGAAAGGG CAAAACTGAC GACGCTTAAA GAGGCTTTGG 
AAAAATACAT ATTCTTTTAT ACTACTCCCT TGCATATAAT ATATAGCCGT CCATCTCTGC AACCTTTCCC GTTTTGACTG CTGCGAATTT CTCCGAAACC 

2320 2340 2360 2380 2400 

AGACAGAGTC GGTGACCACA AATGTGTCAA GCTACGACAG TGGAACTCCC CTTGAGGATG ACTCCGACAC TTCCCTGAAG CTTGGGTATA ATTTGTTTAA 
TCTGTCTCAG CCACTGGTGT TTACACAGTT CGATGCTGTC ACCTTGAGGG GAACTCCTAC TGAGGCTGTG AAGGGACTTC GAACCCATAT TAAACAAATT 

2420 2440 2460 2480

CTGAACATAT TTCAAACTTT TTGTTGACAT TTTGTATGTG GATGTTTACT AACTGTTTGT TGGTTAGGCT TCCATCTTGG GAA
GACTTGTATA AAGTTTGAAA AACAACTGTA AAACATACAC CTACAAATGA TTGACAAACA ACCAATCCGA AGGTAGAACC CTT 



SEQ ID NO: 47 

Arabidopsis AGL27 genomic sequence

-2981 -2961 -2941 -2921 -2901

CAACCAGCAG CACCAGCTGC AATCAAATCC TTTACGGTTC TTTGAATGTT TAGCGCATTT CCTCCACCGG TATCTTGAAA AGATCAAAAG AAACCTATGA 
GTTGGTCCTC GTGGTCCACG TTAGTTTAGG AAATGCCAAG AAACTTACAA ATCGCGTAAA GGAGGTGGCC ATAGAACTTT TCTAGTTTTC TTTGGATACT 

-2881 -2861 -2841 -2821 -2801 

AGAGAACTAT AACCAAGCAA ATCCACTATT TTCAAAAAGC TATGAAGAGA ACTATAAGCA AGCAAGCGAC TCTAACCAAG AAAGATTGAT ACTTTCAATC 
TCTCTTGATA TTGGTTCGTT TAGGTGATAA AAGTTTTTCG ATACTTCTCT TGATATTCGT TCGTTCGCTG AGATTGGTTC TTTCTAACTA TGAAAGTTAG 

-2781 -2761 -2741 -2721 -2701 

TTTGGTAAAG AATCAACGAC TCAATGTTTT TAAATGTTTT TTTTCCTTTT TTGGTTTTAG TTAAGCTTCT TGCATTCTTT AATGATGTCT TTATTATACT 
AAACCATTTC TTAGTTGCTG AGTTACAAAA ATTTACAAAA AAAAGGAAAA AACCAAAATC AATTCGAAGA ACGTAAGAAA TTACTACAGA AATAATATGA

-2681 -2661 -2641 -2621 -2601 

ATCAAAATTT TGCAACTTTA CCAGCATCTG CAATGATGGG TATATTAGGA GCTGACGCAC ACACCGACCT TGCCGTCGCA GCCATCTCCG GTGGTCTAAA 
TAGTTTTAAA ACGTTGAAAT GGTCGTAGAC GTTACTACCC ATATAATCCT CGACTGCGTG TGTGGCTGGA ACGGCAGCGT CGGTAGAGGC CACCAGATTT 

-2581 -2561 -2541 -2521 -2501

ACGACGAAAG AACACAAATA AAACGAAAGC ATACAAACAA AAAATTACTA AAGAAAGAAA AAAAAAAAGG TGGCGCACGT TAGCAAACCG AAATCGGGTT 
TGCTGCTTTC TTGTGTTTAT TTTGCTTTCG TATGTTTGTT TTTTAATGAT TTCTTTCTTT TTTTTTTTCC ACCGCGTGCA ATCGTTTGGC TTTAGCCCAA 

-2481 -2461 -2441 -2421 -2401

TTCCCAGGAG AGAAGCGGAT AAGGCGTAAC CGGATATAAA ACCAGCGGAG AATCCGGTTT GCTGCACAAT AGCCGCGGAT AAGGCATCGT AGCATCCAGG 
AAGGGTCCTC TCTTCGCCTA TTCCGCATTG GCCTATATTT TGGTCGCCTC TTAGGCCAAA CGACGTGTTA TCGGCGCCTA TTCCGTAGCA TCGTAGGTCC 

-2381 -2361 -2341 -2321 -2301 

CATAAGCACA ATGCCTTGTT CTTCAATCAG GCGATGAAAA CGTGTTTGGA TTCTCGCTGT CGGATTCACC AATCTCGCCG CGCGTGGGTT CCGTCGGAAT 
GTATTCGTGT TACGGAACAA GAAGTTAGTC CGCTACTTTT GCACAAACCT AAGAGCGACA GCCTAAGTGG TTAGAGCGGC GCGCACCCAA GGCAGCCTTA 

-2281 -2261 -2241 -2221 -2201 

GTTGGTGAAG CTGTAAGGTT TAAGCTGCTA CAACAGAGTG AAGTTGTTTT GACAGCCATT AACATCGACA TTCTTCGAAG CCTCGAACAA GTTTTTTCTT 
CAACCACTTC GACATTCCAA ATTCGACGAT GTTGTCTCAC TTCAACAAAA CTGTCGGTAA TTGTAGCTGT AAGAAGCTTC GGAGCTTGTT CAAAAAAGAA 

-2181 -2161 -2141 -2121 -2101 

CTCTCTAATC GAGTTAGACT CTGACCCACA CGCTTGGGAT TTTAATAGAG AGCACGTGGT TATTATATCT CGGTCTTATC TTATGGTAAC AGTATCTCAA 
GAGAGATTAG CTCAATCTGA GACTGGGTGT GCGAACCCTA AAATTATCTC TCGTGCACCA ATAATATAGA GCCAGAATAG AATACCATTG TCATAGAGTT 

-2081 -2061 -2041 -2021 -2001 

AGACTCAAAC CACAAGGTAT TGTGAAAATG TTAGAGGCAA TCTAACAATA AATGTATAAT TTGGTTAGCT TAAGCTCATC ATAGAAATGG GCCTTTATGT 
TCTGAGTTTG GTGTTCCATA ACACTTTTAC AATCTCCGTT AGATTGTTAT TTACATATTA AACCAATCGA ATTCGAGTAG TATCTTTACC CGGAAATACA 

-1981 -1961 -1941 -1921 -1901 

CACCAAACCT ATTTCACAAC ATAACACAAG AGCCCACAAA ACAACGACTC CTTTCTCCAC CAGAACAAGC ACGACAAAGG CAAGAGAGTT GCAAAAGACC 
GTGGTTTGGA TAAAGTGTTG TATTGTGTTC TCGGGTGTTT TGTTGCTGAG GAAAGAGGTG GTCTTGTTCG TGCTGTTTCC GTTCTCTCAA CGTTTTCTGG 

-1881 -1861 -1841 -1821 -1801 

TATAAGATGA TAACAATCGA AAAGATGTAA ATTTTGAGAA AAATCAAAAT AAACAAGAAA GATTTCATTG TTTTTCACTT TTTCTCCATT TCTACTTTGA 
ATATTCTACT ATTGTTAGCT TTTCTACATT TAAAACTCTT TTTAGTTTTA TTTGTTCTTT CTAAAGTAAC AAAAAGTGAA AAAGAGGTAA AGATGAAACT 

-1781 -1761 -1741 -1721 -1701 

TTTTACATAC TCTATGGGCC AACCAATTTC CAACCTAATG tTTGATAAAA AATGATTCGG TTTTACTATC TCAACAAATT GGGCCTACAA CATCCAATTT 
AAAATGTATG AGATACCCGG TTGGTTAAAG GTTGGATTAC GAACTATTTT TTACTAAGCC AAAATGATAG AGTTGTTTAA CCCGGATGTT GTAGGTTAAA 

-1681 -1661 -1641 -1621 -1601 

CATGTAGTGA CTTGTTTTTG CCTTTTTCAC ATCTCAACAA ATTGGGTCGT TTGTATTTAA GAAATTGTTA CAGCTTTTTA GACTGAATTT TACTTTATGG 
GTACATCACT GAACAAAAAC GGAAAAAGTG TAGAGTTGTT TAACCCAGCA AACATAAATT CTTTAACAAT GTCGAAAAAT CTGACTTAAA ATGAAATACC 

-1581 -1561 -1541 -1521 -1501 

CTTTATGCTC TCTTTTTCCG TTTTGATTAA GGGTGAATAT GTAAACTGTT GATACCATCT GATTTTTTTT ATTTTTTATT TTTCTTGTGT GCAACTATAC 
GAAATACGAG AGAAAAAGGC AAAACTAATT CCCACTTATA CATTTGACAA CTATGGTAGA CTAAAAAAAA TAAAAAATAA AAAGAACACA CGTTGATATG 

-1481 -1461 -1441 -1421 -1401 

CATCTGAATT CAATTGACAT TTTAGCCAAA TAAAAAAGAT TGGTCCACTT GGATGGCTGT AAAAAAGTTT AGTGGAAGTA TTTATAGGGC TTGTTGGCAA 
GTAGACTTAA GTTAACTGTA AAATCGGTTT ATTTTTTCTA ACCAGGTGAA CCTACCGACA TTTTTTCAAA TCACCTTCAT AAATATCCCG AACAACCGTT 

-1381 -1361 -1341 -1321 -1301 

TCTTCACCAA CGGCTATAAT GTTGATCTTT TTAAAATTAA ACTTACCGTT CGACTGTCTT CTCAACGATT TGACAATTAG CCGTTAGATT AGTATTACTG 
AGAAGTGGTT GCCGATATTA CAACTAGAAA AATTTTAATT TGAATGGCAA GCTGACAGAA GAGTTGCTAA ACTGTTAATC GGCAATCTAA TCATAATGAC 

-1281 -1261 -1241 -1221 -1201 

ATTTATTATT AACAAACCCA TTTCTTTTCT TATTTTTGAA TAAGCTAAAT CAGGCCAATA AAAGGGACAA GTAGAGATGG GCTATTTCTT TTTTTCTCTT 
TAAATAATAA TTGTTTGGGT AAAGAAAAGA ATAAAAACTT ATTCGATTTA GTCCGGTTAT TTTCCCTGTT CATCTCTACC CGATAAAGAA AAAAAGAGAA 

-1181 -1161 -1141 -1121 -1101 

TTTTTTTTTC TTATGTAGTA GAGAAAAGCC TTTATTCTTA GAGCTATCAT TTACCACCCA TTAACCAGAA GCTGAGAAAT GAAGCAAGCC GAAACGAATT 
AAAAAAAAAG AATACATCAT CTCTTTTCGG AAATAAGAAT CTCGATAGTA AATGGTGGGT AATTGGTCTT CGACTCTTTA CTTCGTTCGG CTTTGCTTAA 

-1081 -1061 -1041 -1021 -1001 

TGTAGTTTTG GACGGTGAAA TTATATCGGG CCTTTAATGG GCATGTGAAT AGAGTTGAGA GTCTTTTTGC CCCAAATAAT CGTTTAAGGG AGTATTGGCT 
ACATCAAAAC CTGCCACTTT AATATAGCCC GGAAATTACC CGTACACTTA TCTCAACTCT CAGAAAAACG GGGTTTATTA GCAAATTCCC TCATAACCGA 

-981 -961 -941 -921 -901 

CGTTGGTTTA ATATTGGGCC GAAACGAGAT TGGGAAGAAG AACAATGTCG GTTTAATCCG GTTAGGGTCG TGGGCTGATT CTGGTTCACC TTTATAGCGT 
GCAACCAAAT TATAACCCGG CTTTGCTCTA ACCCTTCTTC TTGTTACAGC CAAATTAGGC CAATCCCAGC ACCCGACTAA GACCAAGTGG AAATATCGCA 

-881 -861 -841 -821 -801 

AAGCGAACAA ACATTGAAAA TGGGGAAGCC AAATTAGTTA CCATCCCTAA CTCAGTTTTG AGACGTAGTA TGAATGAGCC ACGGCAGAAC CTACGACCTA 
TTCGCTTGTT TGTAACTTTT ACCCCTTCGG TTTAATCAAT GGTAGGGATT GAGTCAAAAC TCTGCATCAT ACTTACTCGG TGCCGTCTTG GATGCTGGAT 

-781 -761 -741 -721 -701 

ACTCGATAAA GTAATGGTTA CTCTTGGAGA CGGAAGAAAG CACAAAGATT TTGATAAGGC TTTCTAGTTG GTGAAATGGT CAAAATCGCT CGGAGAGCCA 
TGAGCTATTT CATTACCAAT GAGAACCTCT GCCTTCTTTC GTGTTTCTAA AACTATTCCG AAAGATCAAC CACTTTACCA GTTTTAGCGA GCCTCTCGGT 

-681 -661 -641 -621 -601 

TCATAGGAGC GGGGAGGTGC TATCTGAATA TCCCAATGCA TCAAGACAAG ATGGATTCAG AAAACAAAGA AATTAAACAA ACATTTTAAA ATATGCTCTT 
AGTATCCTCG CCCCTCCACG ATAGACTTAT AGGGTTACGT AGTTCTGTTC TACCTAAGTC TTTTGTTTCT TTAATTTGTT TGTAAAATTT TATACGAGAA 

-581 -561 -541 -521 -501 

AGTTTTAGAT AATATAATGT TTTCAATACC AATTATCTTA CACTGATAGT GGTCAAGTTA CTAATCACTT TTAATAAATT GGTGATAGTC AAACGTATTG 
TCAAAATCTA TTATATTACA AAAGTTATGG TTAATAGAAT GTGACTATCA CCAGTTCAAT GATTAGTGAA AATTATTTAA CCACTATCAG TTTGCATAAC 

-481 -461 -441 -421 -401 

AAAATTATCG ATTTAAAAAT ATTTGAATTC AAAACCATTT TAGTGAAAGT TTGCATTGTA GTTTTGATTA TCCGATCAAT CTTTAATATA ATTACGTCAA 
TTTTAATAGC TAAATTTTTA TAAACTTAAG TTTTGGTAAA ATCACTTTCA AACGTAACAT CAAAACTAAT AGGCTAGTTA GAAATTATAT TAATGCAGTT 

-381 -361 -341 -321 -301 

TAATAACTGA AATCCTTGAA TTAACCGTTA CCCGATTCAT AAGCACTACT TTCCGATCAA AACCAATGAG ATAAAATAAC TTTTAAACCC TCCAAATAAA 
ATTATTGACT TTAGGAACTT AATTGGCAAT GGGCTAAGTA TTCGTGATGA AAGGCTAGTT TTGGTTACTC TATTTTATTG AAAATTTGGG AGGTTTATTT 

-281 -261 -241 -221 -201 

AAGAGAAAAC CTTAAAAACC AATTTCTGTT CGGTGGGGAT GATGATCGGA CTCGGACCGG TCTAACCGAC TGGATTAAAA AGTCTTTAAC AACGACAAGC 
TTCTCTTTTG GAATTTTTGG TTAAAGACAA GCCACCCCTA CTACTAGCCT GAGCCTGGCC AGATTGGCTG ACCTAATTTT TCAGAAATTG TTGCTGTTCG 

-181 -161 -141 -121 -101 

TTAAAAATTT GCCTCTTAGT GGCTTCAAAA CGCAATCGTT TCGCTTAATA CTATTATTTT CTCTATCTCG TTTAACCAAA AAAAAAAACG AGTTGGAGGA 
AATTTTTAAA CGGAGAATCA CCGAAGTTTT GCGTTAGCAA AGCGAATTAT GATAATAAAA GAGATAGAGC AAATTGGTTT TTTTTTTTGC TCAACCTCCT 

-81 -61 -41 -21 -1 

AAAAAAAAAC CAAGAAAAAA GAATAAAAAG CAAAAAGCAT TGAGCGTCTC CGGAGATTAG GATTAAATTA GGGCATAACC CTTATCGGAG ATTTGAAGCC 
TTTTTTTTTG GTTCTTTTTT CTTATTTTTC GTTTTTCGTA ACTCGCAGAG GCCTCTAATC CTAATTTAAT CCCGTATTGG GAATAGCCTC TAAACTTCGG 

20 40 60 80 100 

ATGGGAAGAA GAAAAATCGA GATCAAGCGA ATCGAGAACA AAAGCAGTCG ACAAGTCACT TTCTCCAAAC GACGCAATGG TCTCATCGAC AAAGCTCGAC 
TACCCTTCTT CTTTTTAGCT CTAGTTCGCT TAGCTCTTGT TTTCGTCAGC TGTTCAGTGA AAGAGGTTTG CTGCGTTACC AGAGTAGCTG TTTCGAGCTG 

120 140 160 180 200 

AACTTTCGAT TCTCTGTGAA TCCTCCGTCG CTGTTGTCGT CGTATCTGCC TCCGGAAAAC TCTATGACTC TTCCTCCGGT GACGAGTAAG AAGATACTTT 
TTGAAAGCTA AGAGACACTT AGGAGGCAGC GACAACAGCA GCATAGACGG AGGCCTTTTG AGATACTGAG AAGGAGGCCA CTGCTCATTC TTCTATGAAA 

220 240 260 280 300 

CCTTTTCTGG GTCTCACTCG ATTTTTGTGC TTTTTTACTT TGTTTAATTA CTTTCTCCAT ATAGAAGCTT CAAATCTAGG GCTTTTTGAT TCCATCAAAT 
GGAAAAGACC CAGAGTGAGC TAAAAACACG AAAAAATGAA ACAAATTAAT GAAAGAGGTA TATCTTCGAA GTTTAGATCC CGAAAAACTA AGGTAGTTTA 

320 340 360 380 400 

CAACTGAGAT TTTCTCCTTG TTTTCTGTAT GAAGATAGCA GATGCGTAAG CTTTAACCTA ATTTAAGACT AAACATTTTG ATCGCCAAGA TATGTTCTTG 
GTTGACTCTA AAAGAGGAAC AAAAGACATA CTTCTATCGT CTACGCATTC GAAATTGGAT TAAATTCTGA TTTGTAAAAC TAGCGGTTCT ATACAAGAAC 

420 440 460 480 500 

ATGTTCGTTT CGTGTTTTTT TTTTCGTGTT TTTTTTTTTT TCATTTTAAA ATCATTTTTA TCTCTTTTTT TACCTTCATT TGTGACGAAA TTTAATATTG 
TACAAGCAAA GCACAAAAAA AAAAGCACAA AAAAAAAAAA AGTAAAATTT TAGTAAAAAT AGAGAAAAAA ATGGAAGTAA ACACTGCTTT AAATTATAAC 

520 540 560 580 600 

CATGTTATTC AAGAAACTTT TCTACACGTG GTGATTCGTT CTTGATGTTG TTTAAGTAAT CTTTGTATTG CTAGTTCCAT CTGTTGTTCA CTTTGAAGCT 
GTACAATAAG TTCTTTGAAA AGATGTGCAC CACTAAGCAA GAACTACAAC AAATTCATTA GAAACATAAC GATCAAGGTA GACAACAAGT GAAACTTCGA 

620 640 660 680 700 

TCGTTTTTTC ATATAAGAAA CAATATGTTT AGATTGTTCA AATTTTGAGA TTTGGTAATT ATATTCAATA TTGCAATGCA CTTCAAGTAG TTTTGTTGAG 
AGCAAAAAAG TATATTCTTT GTTATACAAA TCTAACAAGT TTAAAACTCT AAACCATTAA TATAAGTTAT AACGTTACGT GAAGTTCATC AAAACAACTC 

720 740 760 780 800 

AGATTATTTG GGGTTAGTGG TAACATTAAT CGAATATCTT TGGTTCAAAT TGGTTAACAC ATTGTACTTT ATGTTGATCC AAAATGTATT GTAGATCTTT 
TCTAATAAAC CCCAATCACC ATTGTAATTA GCTTATAGAA ACCAAGTTTA ACCAATTGTG TAACATGAAA TACAACTAGG TTTTACATAA CATCTAGAAA 

820 840 860 880 900 

TCTTTTGTAA TTCTCTTTAA GGAATAAGGT TTATCTAGTT GATTTTGATG GTTTATTGTA GTGCTGGGAT AAGTTTCCAC ATTGATACTC GCCACACATT 
AGAAAACATT AAGAGAAATT CCTTATTCCA AATAGATCAA CTAAAACTAC CAAATAACAT CACGACCCTA TTCAAAGGTG TAACTATGAG CGGTGTGTAA 

920 940 960 980 1000 

CTTCATTACT TAACTAATTG GATATCGATT TTAACCCTTT TAATCGTAAT TTGTTGTGTG TTTATGACAC CATACAAGAT ACATTATGTC TTACTGAGTG 
GAAGTAATGA ATTGATTAAC CTATAGCTAA AATTGGGAAA ATTAGCATTA AACAACACAC AAATACTGTG GTATGTTCTA TGTAATACAG AATGACTCAC 

1020 1040 1060 1080 1100 

ACTCTTTGTT GCTCTCTAAG ATGTTGTAGT TTGGATTTCT TTGCTAAAGA AACTCAAACT ATAACTGATT TTACTGCTAC CATATATATG TCAGTGGCCT 
TGAGAAACAA CGAGAGATTC TACAACATCA AACCTAAAGA AACGATTTCT TTGAGTTTGA TATTGACTAA AATGACGATG GTATATATAC AGTCACCGGA 

1120 1140 1160 1180 1200 

AGTAGGTTCA TTAAGTAGAA ATCGGTCGCC AATTTTACTA ATTGGGAGAA ACCACTAGAC TACAACCAAA TGTTCAATGA CTTTAATAGT CTTCTGTTAT 
TCATCCAAGT AATTCATCTT TAGCCAGCGG TTAAAATGAT TAACCCTCTT TGGTGATCTG ATGTTGGTTT ACAAGTTACT GAAATTATCA GAAGACAATA 

1220 1240 1260 1280 1300 

TTGTCGTGGA TATTTTTAAC CCCATGAACT TTTGTATCTA GAAAAATCTC ATCCACTTCT CTTTTAGAAT ACTTTGAATG CGACTAAAAG TGAGTTTTTT 
AACAGCACCT ATAAAAATTG GGGTACTTGA AAACATAGAT CTTTTTAGAG TAGGTGAAGA GAAAATCTTA TGAAACTTAC GCTGATTTTC ACTCAAAAAA 

1320 1340 1360 1380 1400 

TTTTCTAATA GACCTAAGAT AAAATCATCA ATGGATAAGT AGGAAATGGA AAGGTAACTC TTGTCAGTAT GTGTATATAT ACAGCTCCTT CTCATTTCCT 
AAAAGATTAT CTGGATTCTA TTTTAGTAGT TACCTATTCA TCCTTTACCT TTCCATTGAG AACAGTCATA CACATATATA TGTCGAGGAA GAGTAAAGGA 

1420 1440 1460 1480 1500 

TGATGTTGAC TCCATAAATG CTTGATCATG AAAGCAAATT TGTTAAATTT GTAACCAACA AAATGCACAG ACTATAGACG AAGTATTAGG AACCGTATCT 
ACTACAACTG AGGTATTTAC GAACTAGTAC TTTCGTTTAA ACAATTTAAA CATTGGTTGT TTTACGTGTC TGATATCTGC TTCATAATCC TTGGCATAGA 

1520 1540 1560 1580 1600 

ATCTGTCTCC ATTTTACAAT AGTCAAGCTC TAGTTGTAGC TAGTTTCTTT ATTTAGTTCT TATACCTTAA CAAAGTGGCA CTATGCAAAG TGTTTTTAGT 
TAGACAGAGG TAAAATGTTA TCAGTTCGAG ATCAACATCG ATCAAAGAAA TAAATCAAGA ATATGGAATT GTTTCACCGT GATACGTTTC ACAAAAATCA 

1620 1640 1660 1680 1700 

TGAGATTAGT CGTCTTATGC GTCTTACTAA TTGTTCATTT TTTCTTCTTT TTGTGATTGA TGTAAAATTA CTAAGTCACA ACTTGAGATG TTACTAAAAA 
ACTCTAATCA GCAGAATACG CAGAATGATT AACAAGTAAA AAAGAAGAAA AACACTAACT ACATTTTAAT GATTCAGTGT TGAACTCTAC AATGATTTTT 

1720 1740 1760 1780 1800 

GATAAGAACG TGTAATAACT GAAGTGAATT TGAAGCCAGT CTCTATTCAT ATCATAGCAT TAATAGATCA TGGACAACAC ATATATAGGA TTAGAGCTGT 
CTATTCTTGC ACATTATTGA CTTCACTTAA ACTTCGGTCA GAGATAAGTA TAGTATCGTA ATTATCTAGT ACCTGTTGTG TATATATCCT AATCTCGACA 

1820 1840 1860 1880 1900 

CATGACCTTC CCGGAAATGC TAAATCAGTT TCTTGGTTTA TCCTTTTTGG AGTATCATGA TATCATTTAG CCAAAGGTTT TTGGTTTCAG TATTCCGATT 
GTACTGGAAG GGCCTTTACG ATTTAGTCAA AGAACCAAAT AGGAAAAACC TCATAGTACT ATAGTAAATC GGTTTCCAAA AACCAAAGTC ATAAGGCTAA 

1920 1940 1960 1980 2000 

CGTTTGACGT TATGTGTGAA AGCGTCAATA ACTAAAACTT GGATTGACTA GTCAAAATAT AAACTGATTG CATTGAATTC TTGAAAATTT TCCCTTAAAA 
GCAAACTGCA ATACACACTT TCGCAGTTAT TGATTTTGAA CCTAACTGAT CAGTTTTATA TTTGACTAAC GTAACTTAAG AACTTTTAAA AGGGAATTTT 

2020 2040 2060 2080 2100 

TGAACATGAA TTTCATCAAG ATTTTGTCTT TTGGAAGGAT GTGATTTATA ATCTATACAA TCATACATTT TGCATGATAT TAGTTTTTTG AAGAACCAAA 
ACTTGTACTT AAAGTAGTTC TAAAACAGAA AACCTTCCTA CACTAAATAT TAGATATGTT AGTATGTAAA ACGTACTATA ATCAAAAAAC TTCTTGGTTT 

2120 2140 2160 2180 2200 

AATAGAGCTT CTTTATAAAA CTGATTTAGC CTTGATAAGA AAAAGAAGGT AGATAATCGA ACTCATGGGG ATGAGTTAAA AATGTGTGCA CTTAGTTTCT 
TTATCTCGAA GAAATATTTT GACTAAATCG GAACTATTCT TTTTCTTCCA TCTATTAGCT TGAGTACCCC TACTCAATTT TTACACACGT GAATCAAAGA 

2220 2240 2260 2280 2300 

AAAACCTTTT GAAGTCGAAA CAATGACAAT ATTGGCTGCG AAGTTGATAT ATAACAGGAT CTTAAAGTTG AAATTGTAAA TTCAGATTTT AATTTTAGAG 
TTTTGGAAAA CTTCAGCTTT GTTACTGTTA TAACCGACGC TTCAACTATA TATTGTCCTA GAATTTCAAC TTTAACATTT AAGTCTAAAA TTAAAATCTC 

2320 2340 2360 2380 2400 

CACCAGATGA TCAGAGTTTC AGATTTACAT TTGAAGTATA AAACATTTTG AACACATATA TCTAAAGCAG TAACTTCAAA AATAGGGTAA CTAATAGTAA 
GTGGTCTACT AGTCTCAAAG TCTAAATGTA AACTTCATAT TTTGTAAAAC TTGTGTATAT AGATTTCGTC ATTGAAGTTT TTATCCCATT GATTATCATT 

2420 2440 2460 2480 2500 

CTTACATTGT TTTTTTTAAT GCTTTTATAC TTACTATCAT TTTTATATAT AGATGCCTGG TTAAGTAAAG ATGATTATCA AAAACTGTTG GTTAGTAACA 
GAATGTAACA AAAAAAATTA CGAAAATATG AATGATAGTA AAAATATATA TCTACGGACC AATTCATTTC TACTAATAGT TTTTGACAAC CAATCATTGT 

2520 2540 2560 2580 2600 

GAAATTGTTG CAAATGTAAC ATATTATATA AGCTTTCTTT CACTTTGGTG CATTCTCTCT AAATAATGGC CTCTATTGAT GCAGTATCTG ATTCTTAGTT 
CTTTAACAAC GTTTACATTG TATAATATAT TCGAAAGAAA GTGAAACCAC GTAAGAGAGA TTTATTACCG GAGATAACTA CGTCATAGAC TAAGAATCAA 

2620 2640 2660 2680 2700 

TTGAAATGGT TTTTGCATAA ATTATTGTTC TAATGCATTT TTGTTTTATC TCCAGCATTT CCAAGATCAT TGATCGTTAT GAAATACAAC ATGCTGATGA 
AACTTTACCA AAAACGTATT TAATAACAAG ATTACGTAAA AACAAAATAG AGGTCGTAAA GGTTCTAGTA ACTAGCAATA CTTTATGTTG TACGACTACT 

2720 2740 2760 2780 2800 

ACTTAGAGCC TTAGTAAGTA ATTAGCTAAG AACGTCATTC TAATATTCTT CTGGATGCGG TTTTTGGTGT TATGAAGGAT AGAAGCGCTG TTCAAGCCGG 
TGAATCTCGG AATCATTCAT TAATCGATTC TTGCAGTAAG ATTATAAGAA GACCTACGCC AAAAACCACA ATACTTCCTA TCTTCGCGAC AAGTTCGGCC 

2820 2840 2860 2880 2900 

AGAAACCTCA ATGTTTTGAA CTCGTAACAC CGAACTTAAT TCTCTAGAGT TACAGTTATT GTGTCTACTG GAAAATACAA GAACTTCACA ATCTTTCTGA 
TCTTTGGAGT TACAAAACTT GAGCATTGTG GCTTGAATTA AGAGATCTCA ATGTCAATAA CACAGATGAC CTTTTATGTT CTTGAAGTGT TAGAAAGACT 

2920 2940 2960 2980 3000 

CCATTCCTTT TCTTCATGTG CAGGATCTTG AAGAAAAAAT TCAGAATTAT CTTCCACACA AGGAGTTACT AGAAACAGTC CAAAGGTTAG CAGTACGACA 
GGTAAGGAAA AGAAGTACAC GTCCTAGAAC TTCTTTTTTA AGTCTTAATA GAAGGTGTGT TCCTCAATGA TCTTTGTCAG GTTTCCAATC GTCATGCTGT 

3020 3040 3060 3080 3100 

CATTTTTCTC CCCTCTTCTT CTGATAAAAA AAATGTTTTT TTTCTTTTGT CTACTTGTGA ATACAGCAAG CTTGAAGAAC CAAATGTCGA TAATGTAAGT 
GTAAAAAGAG GGGAGAAGAA GACTATTTTT TTTACAAAAA AAAGAAAACA GATGAACACT TATGTCGTTC GAACTTCTTG GTTTACAGCT ATTACATTCA 

3120 3140 3160 3180 3200 

GTAGATTCTC TAATTTCTCT GGAGGAACAA CTTGAGACTG CTCTGTCCGT AAGTAGAGCT AGGAAGGTAT ATGTGCTGCT ACTAAGTGAT TCAACCAATT 
CATCTAAGAG ATTAAAGAGA CCTCCTTGTT GAACTCTGAC GAGACAGGCA TTCATCTCGA TCCTTCCATA TACACGACGA TGATTCACTA AGTTGGTTAA 

3220 3240 3260 3280 3300 

ACTCCACAAA ACCTTCTTTT TAGTTAGTTA TCCTAGAACA ATCTTTTGAC ATAAATCTTA ATGTCTTGTT ATAGGCAGAA CTGATGATGG AGTATATCGA 
TGAGGTGTTT TGGAAGAAAA ATCAATCAAT AGGATCTTGT TAGAAAACTG TATTTAGAAT TACAGAACAA TATCCGTCTT GACTACTACC TCATATAGCT 

3320 3340 3360 3380 3400 

GTCCCTTAAA GAAAAGGTTA GTGCTTTGGT TTTTATTTTC GATAAAGGCC ATATTCTAGG CTATGATGAT TCTTGAATTC TATTAACCTG CTGAGTCTAC 
CAGGGAATTT CTTTTCCAAT CACGAAACCA AAAATAAAAG CTATTTCCGG TATAAGATCC GATACTACTA AGAACTTAAG ATAATTGGAC GACTCAGATG 

3420 3440 3460 3480 3500 

AGATTACTAT ATATATATAT ATATATCTTT TGGTCTTGTC TTAGTTCCTG ATTTAGTATT GGCTTCATTC AGGTGAAACC CTAATGAGAA TTAAAAAAAC 
TCTAATGATA TATATATATA TATATAGAAA ACCAGAACAG AATCAAGGAC TAAATCATAA CCGAAGTAAG TCCACTTTGG GATTACTCTT AATTTTTTTG 

3520 3540 3560 3580 3600 

AAGCAGTTTT AAACTCTTGA TCAAATCCAA CCTTTCCCTC ATAAAGTGTC GAATTTGGAT GAGGATGATT TATGTTTCGA GAAGGAAACA TGTTTGGAAA 
TTCGTCAAAA TTTGAGAACT AGTTTAGGTT GGAAAGGGAG TATTTCACAG CTTAAACCTA CTCCTACTAA ATACAAAGCT CTTCCTTTGT ACAAACCTTT 

3620 3640 3660 3680 3700 

TAGCTATAGA AGTTGTTAGA AACTAATGAC CTTATGATCT TTTCCAAACA GGAGAAATTG CTGAGAGAAG AGAACCAGGT TCTGGCTAGC CAGGTAACAA 
ATCGATATCT TCAACAATCT TTGATTACTG GAATACTAGA AAAGGTTTGT CCTCTTTAAC GACTCTCTTC TCTTGGTCCA AGACCGATCG GTCCATTGTT 

3720 3740 3760 3780 3800 

TGACCACAAT ATCTTCTGCT CTTGAAGCTA ATTAATCACT TTATACGTCC CCGTTATAGA GAGATACACA TATACACGTA CATGAAAACT AAAAGTTGAA 
ACTGGTGTTA TAGAAGACGA GAACTTCGAT TAATTAGTGA AATATGCAGG GGCAATATCT CTCTATGTGT ATATGTGCAT GTACTTTTGA TTTTCAACTT 

3820 3840 3860 3880 3900 

GGACTTTGAT GGATACTAGA CAATTATAGT GAAACCCTAA ATATGTGATA AGTGATAACA AAATGCTTTT AAAATCTATC TTTCTTGTTA ATTTAGTAGC 
CCTGAAACTA CCTATGATCT GTTAATATCA CTTTGGGATT TATACACTAT TCACTATTGT TTTACGAAAA TTTTAGATAG AAAGAACAAT TAAATCATCG 

3920 3940 3960 3980 4000 

TGTCAGAGAA GAAAGGTATG TCTCACCGAT GAAAGATACT CAAAACCCGG TATTTTTAAT TTGTGAAATT TGCAAATAAA AAAAATGCTT TCTACAAGAT 
ACAGTCTCTT CTTTCCATAC AGAGTGGCTA CTTTCTATGA GTTTTGGGCC ATAAAAATTA AACACTTTAA ACGTTTATTT TTTTTACGAA AGATGTTCTA 

4020 4040 4060 4080 4100 

AGATTAATTT CTTGCAATGT TTAGTAGCTG TAGAAAAAAA AGAAATGTAA GAAAGTTTCT TACAGATGGG AAAGAATACG TTGCTGGCAA CAGATGATGA 
TCTAATTAAA GAACGTTACA AATCATCGAC ATCTTTTTTT TCTTTACATT CTTTCAAAGA ATGTCTACCC TTTCTTATGC AACGACCGTT GTCTACTACT 

4120 4140 4160 4180 4200 

GAGAGGAATG TTTCCGGGAA GTAGCTCCGG CAACAAAATA CCGGAGACTC TCCCGCTGCT CAATTAGCCA CCATCATCAA CGGCTGAGTT TTCACCTTAA 
CTCTCCTTAC AAAGGCCCTT CATCGAGGCC GTTGTTTTAT GGCCTCTGAG AGGGCGACGA GTTAATCGGT GGTAGTAGTT GCCGACTCAA AAGTGGAATT 
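
Each pair of lines in the genomic listing above shows the upper strand and, directly beneath it, the base-paired lower strand. As a minimal sketch (the helper names `complement` and `mismatches` are illustrative and not part of the listing), a transcribed pair can be checked for consistency as follows:

```python
# Illustrative helpers (hypothetical names) for checking a transcribed
# pair of strand lines against Watson-Crick base pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement (no reversal) of a DNA strand."""
    return strand.replace(" ", "").upper().translate(COMPLEMENT)


def mismatches(top: str, bottom: str) -> list:
    """Return 1-based positions where `bottom` does not pair with `top`."""
    expected = complement(top)
    observed = bottom.replace(" ", "").upper()
    return [i + 1 for i, (e, o) in enumerate(zip(expected, observed)) if e != o]


# Positions 1-20 of the AGL27 genomic sequence, upper and lower strand.
top = "ATGGGAAGAA GAAAAATCGA"
bottom = "TACCCTTCTT CTTTTTAGCT"
print(mismatches(top, bottom))
```

An empty list indicates that the two lines pair at every position; a non-empty list only flags where the strands disagree and does not say which strand carries the error.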

SEQ ID NO: 48 and SEQ ID NO: 49 

Alternatively spliced Arabidopsis AGL27 cDNA and resulting alternate Arabidopsis AGL27 amino 
acid sequence 

20 40 60 80 100 

ATGGGAAGAA GAAAAATCGA GATCAAGCGA ATCGAGAACA AAAGCAGTCG ACAAGTCACT TTCTCCAAAC GACGCAATGG TCTCATCGAC AAAGCTCGAC 
TACCCTTCTT CTTTTTAGCT CTAGTTCGCT TAGCTCTTGT TTTCGTCAGC TGTTCAGTGA AAGAGGTTTG CTGCGTTACC AGAGTAGCTG TTTCGAGCTG 
MGR RKIE IKR IEN KSSR QVT FSK RRNG LID KAR> 

120 140 160 180 200 

AACTTTCGAT TCTCTGTGAA TCCTCCGTCG CTGTTGTCGT CGTATCTGCC TCCGGAAAAC TCTATGACTC TTCCTCCGGT GACGAGATAG AAGCGCTGTT 

TTGAAAGCTA AGAGACACTT AGGAGGCAGC GACAACAGCA GCATAGACGG AGGCCTTTTG AGATACTGAG AAGGAGGCCA CTGCTCTATC TTCGCGACAA 

QLSI LCE SSV AVVV VSA SGK LYDS SSG DEI EALF> 

220 240 260 280 300 

CAAGCCGGAG AAACCTCAAT GTTTTGAACT CGATCTTGAA GAAAAAATTC AGAATTATCT TCCACACAAG GAGTTACTAG AAACAGTCCA AAGCAAGCTT 

GTTCGGCCTC TTTGGAGTTA CAAAACTTGA GCTAGAACTT CTTTTTTAAG TCTTAATAGA AGGTGTGTTC CTCAATGATC TTTGTCAGGT TTCGTTCGAA 

KPE KPQ CFEL DLE EKI QNYL PHK ELL ETVQ SKL> 

320 340 360 380 400 

GAAGAACCAA ATGTCGATAA TGTAAGTGTA GATTCTCTAA TTTCTCTGGA GGAACAACTT GAGACTGCTC TGTCCGTAAG TAGAGCTAGG AAGGCAGAAC 

CTTCTTGGTT TACAGCTATT ACATTCACAT CTAAGAGATT AAAGAGACCT CCTTGTTGAA CTCTGACGAG ACAGGCATTC ATCTCGATCC TTCCGTCTTG 

EEP NVDN VSV DSL ISLE EQL ETA LSVS RAR KAE> 

420 440 460 480 500 

TGATGATGGA GTATATCGAG TCCCTTAAAG AAAAGGAGAA ATTGCTGAGA GAAGAGAACC AGGTTCTGGC TAGCCAGATG GGAAAGAATA CGTTGCTGGC 

ACTACTACCT CATATAGCTC AGGGAATTTC TTTTCCTCTT TAACGACTCT CTTCTCTTGG TCCAAGACCG ATCGGTCTAC CCTTTCTTAT GCAACGACCG 

LMME YIE SLK EKEK LLR EEN QVLA SQM GKN TLLA> 

520 540 560 

AACAGATGAT GAGAGAGGAA TGTTTCCGGG AAGTAGCTCC GGCAACAAAA TACCGGAGAC TCTCCCGCTG CTCAATTAG 

TTGTCTACTA CTCTCTCCTT ACAAAGGCCC TTCATCGAGG CCGTTGTTTT ATGGCCTCTG AGAGGGCGAC GAGTTAATC 

TDD ERG MFPG SSS GNK IPET LPL LN*> 
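
The one-letter amino acid lines above follow from reading the upper cDNA strand in frame from position 1. As a minimal sketch, assuming the standard genetic code (the names `CODON_TABLE` and `translate` are illustrative and not part of the listing), the correspondence can be reproduced as follows:

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; spaces are ignored, '*' marks a stop."""
    seq = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Positions 1-30 of the cDNA, and its final nine bases (571-579).
print(translate("ATGGGAAGAA GAAAAATCGA GATCAAGCGA"))  # MGRRKIEIKR
print(translate("CTCAATTAG"))                         # LN*
```

Because each ruler row starts at a position of the form 100k + 1, a row translated on its own must be offset so that it stays in the frame set by position 1.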